-
Notifications
You must be signed in to change notification settings - Fork 3.8k
Description
I was able to reproduce similar errors while running the test org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtMiddle as the errors mentioned in this previous PR for org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtTheEnd here: #17264.
Digging deeper I found the root cause to be line 227 in MovingAverageIterable.java where the averagerKey.iterator() is used to traverse the Set<Map<String,Object>> averagerKeys. The iteration order of the Set is not guaranteed to have any specific ordering. For example if I modify lines 225 like such:
- Set<Map<String, Object>> averagerKeys = new HashSet<>(averagers.keySet());
+ Set<Map<String, Object>> averagerKeys = new HashSet<>(101); // Set initial capacity
+ averagerKeys.addAll(averagers.keySet());
averagerKeys.removeAll(seenKeys);
averagersKeysIter = averagerKeys.iterator();
cacheIter = null;Initializing the HashSet to some initial capacity (101), both the extensions-contrib/moving-average-query,org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtMiddle and org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtTheEndfail consistently with the following errors:
[INFO] Running org.apache.druid.query.movingaverage.MovingAverageIterableTest
[ERROR] Tests run: 4, Failures: 4, Errors: 0, Skipped: 0, Time elapsed: 0.811 s <<< FAILURE! -- in org.apache.druid.query.movingaverage.MovingAverageIterableTest
[ERROR] org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtMiddle -- Time elapsed: 0.734 s <<< FAILURE!
org.junit.ComparisonFailure: expected:<[u]> but was:<[f]>
at org.junit.Assert.assertEquals(Assert.java:117)
at org.junit.Assert.assertEquals(Assert.java:146)
at org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtMiddle(MovingAverageIterableTest.java:521)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
and
[INFO] Running org.apache.druid.query.movingaverage.MovingAverageIterableTest
[ERROR] Tests run: 4, Failures: 4, Errors: 0, Skipped: 0, Time elapsed: 0.763 s <<< FAILURE! -- in org.apache.druid.query.movingaverage.MovingAverageIterableTest
[ERROR] org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtTheEnd -- Time elapsed: 0.691 s <<< FAILURE!
org.junit.ComparisonFailure: expected:<[u]> but was:<[f]>
at org.junit.Assert.assertEquals(Assert.java:117)
at org.junit.Assert.assertEquals(Assert.java:146)
at org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtTheEnd(MovingAverageIterableTest.java:441)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
, respectively.
Should the iteration order change both this test and the test mentioned in the PR above would fail.
According to the observations above, I propose solution to make these tests more robust by using a set comparison for the lines where the test asserts for the presence of the injections such as
Set<String> expectedGenders = new HashSet<>(Arrays.asList("u", "f"));
Set<String> actualGenders = new HashSet<>();
Assert.assertTrue(iter.hasNext());
result = iter.next();
actualGenders.add((result.getDimension("gender")).get(0));
Assert.assertEquals(JAN_2, (result.getTimestamp()));
Assert.assertTrue(iter.hasNext());
result = iter.next();
actualGenders.add((result.getDimension("gender")).get(0));
Assert.assertEquals(JAN_2, (result.getTimestamp()));
Assert.assertEquals(expectedGenders, actualGenders);
This would not reduce the coverage of these tests.
@kfaraz Would you like me to open a PR for this?