CASSANDRA-20253: Forward port randomized testing improvements from cep-15-accord to trunk #3835

dcapwell · 2025-01-27T20:51:02Z

No description provided.

dcapwell · 2025-01-27T20:51:55Z

src/java/org/apache/cassandra/config/CassandraRelevantProperties.java

@@ -554,7 +554,7 @@ public enum CassandraRelevantProperties
    TCM_USE_NO_OP_REPLICATOR("cassandra.test.use_no_op_replicator", "false"),

    TEST_BBFAILHELPER_ENABLED("test.bbfailhelper.enabled"),
-    TEST_BLOB_SHARED_SEED("cassandra.test.blob.shared.seed"),
+    TEST_BLOB_SHARED_SEED("cassandra.test.blob.shared.seed", "42"),


current logic used current time, which meant that tests were non-deterministic.... the seed must be set in order to be able to reproduce issues

Not sure I follow. Is the problem that we just didn't output the seed in whatever failure message was produced?

so, the tests use qt() and controls the seed for the test. The blob logic has a giant blob that is shared by all tests, and uses a seed to populate the bytes... this seed is decoupled from the test seed, and would use currentTimeMillis; this caused the tests to be non-deterministic and not repeatable when errors were found.

Yeah, not something I'd expect to unify in this patch.

dcapwell · 2025-01-27T20:53:22Z

test/distributed/org/apache/cassandra/fuzz/snapshots/SnapshotsTest.java

@@ -132,21 +132,21 @@ private Pair<SnapshotsHolder, Map<String, String>> generateParams()
                    case 1:
                        if (!state.truncatedSnapshots.isEmpty())
                        {
-                            keyspace = state.rs.pick(state.truncatedSnapshots.keySet()).split("\\.")[0];
+                            keyspace = state.rs.pickUnorderedSet(state.truncatedSnapshots.keySet()).split("\\.")[0];


in pulling all the RandomSource in this method got renamed to be 2 different methods

pickUnorderedSet and pickOrderedSet... unorded needs the sub type to be Comparable and will do a sort so the result is deterministic... pickOrderedSet doesn't sort as the set order should be deterministic already... this test uses HashMap which is non-determiistic so we need pickUnordedSet

dcapwell · 2025-01-27T20:54:23Z

test/unit/accord/utils/Gens.java

+    }
+
+    /**
+     * This is a change from accord as that uses {@link accord.utils.Utils#reverse}, which doesn't exist in this forward port.


this is the only change... i didn't want to bring in Utils and add more conflict with cep-15-accord... so in-lined... when accord merges this whole file will go away...

dcapwell · 2025-01-27T20:56:57Z

test/unit/org/apache/cassandra/utils/CassandraGenerators.java

+        }
+    }
+
+    public static class TestableNetworkTopologyStrategy extends NetworkTopologyStrategy


NetworkTopologyStrategy validates the datacenters match a config, but that makes it a bit harder to use in some contexts... so this one avoids validating datacenters!

dcapwell · 2025-01-27T20:58:51Z

test/unit/org/apache/cassandra/utils/CassandraGenerators.java

+     * allowed in a primary key.  If a test needs to filter out those cases, can just
+     * {@code .map(CassandraGenerators::simplify)} to resolve.
+     */
+    public static IPartitioner simplify(IPartitioner partitioner)


in a follow up patch ill prob remove this... LocalPartitioner has caused flakyness for several tests in cep-15-accord so plan to refactor that logic to be less problematic, which lets us drop this function... keeping for now just to be in-sync

dcapwell · 2025-01-27T21:00:12Z

test/unit/org/apache/cassandra/utils/Generators.java

@@ -466,7 +535,7 @@ private static final class LazySharedBlob

        static
        {
-            long blobSeed = TEST_BLOB_SHARED_SEED.getLong(System.currentTimeMillis());
+            long blobSeed = TEST_BLOB_SHARED_SEED.getLong();


this default was never correct, as it made tests non-reproducable

src/java/org/apache/cassandra/schema/ReplicationParams.java

maedhroz · 2025-01-28T17:02:15Z

test/unit/accord/utils/random/Picker.java

+ * limitations under the License.
+ */
+
+package accord.utils.random;


The whole general accord.utils package naming is something I suppose we should address in the long term if the things in it are really not Accord-specific. No sense in changing before cep-15-accord merges though, of course.

i think there are a few things...

we have several testing utils that should have a different life cycle than accord, so likely should be moved out for better isolation

when cep-15-accord rebases this forward part has caused issues... the cep-15-accord history hasn't been squashed so all changes to these packages exist in cep-15-accord history, causing conflicts on rebase to trunk... to avoid this i added a commit on top of trunk last time i did the rebase to rename this package; i then removed after the rebase completed...

maedhroz · 2025-01-28T17:06:35Z

test/unit/accord/utils/random/Picker.java

+{
+    public static float[] randomWeights(RandomSource random, int length)
+    {
+        float[] weights = new float[length - 1];


If we have a length of 1, is the method supposed to just return an empty array?

guess so... most of the higher up APIs that end up calling this check for 0/1 and optimize around those (or block)...

maedhroz · 2025-01-28T17:10:18Z

test/unit/accord/utils/random/Picker.java

+
+        static float[] randomWeights(RandomSource random, float[] bias)
+        {
+            float[] weights = new float[bias.length - 1];


Do we ignore one of the bias floats?

accord.utils.random.Picker.Weighted#pickIndex

so, we use binary search to find what weight to select, and if the selected float is larger than max weight, we choose the last offset in bias i believe.

Yeah, like these methods don't explode, I guess, because they use the lengths of the newly created arrays in their loop conditions, but it's a little confusing that we arbitrarily memory hole the last bias in the context of this method itself.

its very leaky... you must use it "correctly" and only once place in the code knows what that means...

maedhroz · 2025-01-28T17:13:20Z

test/unit/accord/utils/Gens.java

@@ -88,6 +91,63 @@ public static <T> Gen<T> oneOf(Map<Gen<T>, Integer> values)
        return rs -> gen.next(rs).next(rs);
    }

+    public static <T> OneOfBuilder<T> oneOf()


Generally not going to make comments about unused items, given I don't know what bits will be used later in some capacity.

its only used in org.apache.cassandra.tcm.log.LogStateTestBase on cep-15-accord, but not in trunk.

I am asking Alex about this, as the usage is for TCM reconstruct of older epochs. This feature was added specifically for Accord, but we had a lot of issues with it so we migrated off it... but seems we still have the API/Tests for it.

maedhroz · 2025-01-28T17:59:40Z

test/unit/accord/utils/Gens.java

+                    throw new AssertionError();
+            }
+        };
+    }


There are some things in this class that we might able to slim down with some tweaking of types. This method might be able to go away with this patch, for instance:

Subject: [PATCH] DRY up mixedDistribution() --- Index: test/unit/accord/utils/Gens.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== diff --git a/test/unit/accord/utils/Gens.java b/test/unit/accord/utils/Gens.java --- a/test/unit/accord/utils/Gens.java (revision 7bc06a322c211f52fbb6df60f2058a32cd57dafc) +++ b/test/unit/accord/utils/Gens.java (date 1738086893079) @@ -331,15 +331,16 @@ throw new IllegalArgumentException("Range is too large; min=" + minInclusive + ", max=" + maxExclusive); if (numBuckets <= 0 || numBuckets > domainSize) throw new IllegalArgumentException("Num buckets must be between 1 and " + domainSize + "; given " + numBuckets); - int[] bucket, indexes; - bucket = new int[numBuckets]; + Integer[] bucket, indexes; + bucket = new Integer[numBuckets]; int delta = domainSize / numBuckets; for (int i = 0; i < numBuckets; i++) bucket[i] = minInclusive + i * delta; - indexes = IntStream.range(0, bucket.length).toArray(); - Gen<Gen.IntGen> indexDistro = mixedDistribution(indexes); + + indexes = (Integer[]) IntStream.range(0, bucket.length).boxed().toArray(); + Gen<Gen<Integer>> indexDistro = mixedDistribution(indexes); return rs -> { - Gen.IntGen indexGen = indexDistro.next(rs); + Gen<Integer> indexGen = indexDistro.next(rs); switch (rs.nextInt(0, 2)) { case 0: // uniform @@ -530,30 +531,9 @@ }; } - public static <T> Gen<Gen.IntGen> mixedDistribution(int[] list) + public static Gen<Gen<Integer>> mixedDistribution(Integer[] list) { - return rs -> { - switch (rs.nextInt(0, 4)) - { - case 0: // uniform - return r -> list[rs.nextInt(0, list.length)]; - case 1: // median biased - int median = rs.nextInt(0, list.length); - return r -> list[r.nextBiasedInt(0, median, list.length)]; - case 2: // zipf - int[] array = list; - if (rs.nextBoolean()) - { - array = Arrays.copyOf(array, array.length); - reverse(array); - } - return pickZipf(array); - case 3: // random weight - return randomWeights(list).next(rs); - default: - throw new AssertionError(); - } - }; + return mixedDistribution(Arrays.asList(list)); } /**

we have a pattern of generic and primitive versions, this helps avoid extra allocations. Given that this change would force boxing I would prefer not to unify.

maedhroz · 2025-01-28T18:10:29Z

test/unit/accord/utils/Property.java

@@ -555,7 +555,7 @@ default void checkPostconditions(State state, Result expected,
        default void process(State state, SystemUnderTest sut) throws Throwable
        {
            checkPostconditions(state, apply(state),
-                    sut, run(sut));
+                                sut, run(sut));


nit: Might as well one-liner this one if we're going to reformat

this is a personal style choice. state, apply(state) is everything going on for the state, and sut, run(sut) is the same for sut... i felt this was easier to read as 2 lines rather than 1.

If you do not agree, i can normalize by making one line; just giving my argument

maedhroz · 2025-01-28T18:14:59Z

test/unit/accord/utils/random/Picker.java

+                weights[i] = sum += random.nextFloat() * bias[i];
+            sum += random.nextFloat() * bias[weights.length];
+            for (int i = 0 ; i < weights.length ; ++i)
+                weights[i] /= sum;


nit: Also, this method seems to normalize the random weights, but that's not reflected in the name.

…p-15-accord to trunk

dcapwell commented Jan 27, 2025

View reviewed changes

dcapwell requested a review from maedhroz January 27, 2025 21:30

maedhroz reviewed Jan 28, 2025

View reviewed changes

src/java/org/apache/cassandra/schema/ReplicationParams.java Outdated Show resolved Hide resolved

maedhroz reviewed Jan 28, 2025

View reviewed changes

maedhroz approved these changes Jan 28, 2025

View reviewed changes

dcapwell added 2 commits January 28, 2025 10:45

CASSANDRA-20253: Forward port randomized testing improvements from ce…

d93e3c0

…p-15-accord to trunk

feedback

188d42a

dcapwell force-pushed the CASSANDRA-20253 branch from edc2c78 to 188d42a Compare January 28, 2025 18:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CASSANDRA-20253: Forward port randomized testing improvements from cep-15-accord to trunk #3835

CASSANDRA-20253: Forward port randomized testing improvements from cep-15-accord to trunk #3835

dcapwell commented Jan 27, 2025

dcapwell Jan 27, 2025

maedhroz Jan 28, 2025

dcapwell Jan 28, 2025

maedhroz Jan 28, 2025

dcapwell Jan 27, 2025

dcapwell Jan 27, 2025

dcapwell Jan 27, 2025

dcapwell Jan 27, 2025

dcapwell Jan 27, 2025

maedhroz Jan 28, 2025 •

edited

Loading

dcapwell Jan 28, 2025

maedhroz Jan 28, 2025

dcapwell Jan 28, 2025

maedhroz Jan 28, 2025

dcapwell Jan 28, 2025

maedhroz Jan 28, 2025

dcapwell Jan 28, 2025

maedhroz Jan 28, 2025

dcapwell Jan 28, 2025

maedhroz Jan 28, 2025

dcapwell Jan 28, 2025

maedhroz Jan 28, 2025

dcapwell Jan 28, 2025

maedhroz Jan 28, 2025

CASSANDRA-20253: Forward port randomized testing improvements from cep-15-accord to trunk #3835

Are you sure you want to change the base?

CASSANDRA-20253: Forward port randomized testing improvements from cep-15-accord to trunk #3835

Conversation

dcapwell commented Jan 27, 2025

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

maedhroz Jan 28, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

maedhroz Jan 28, 2025 •

edited

Loading