Support Request - Generating values within a range #12

JeremyLWright · 2015-08-19T16:47:31Z

Good Morning John,

I'm using autocheck, with great success, and I now want to generate values within a range.

I am testing some date functions and I want to generate years in the inclusive range [1970, 2015]. I tried the ac::fix() combinator, but that just sets an upper limit on the generated value.

I see how I can use discard_if() to throw away values that don't match, but that feels so wasteful. Additionally, I want to generate days, months and years so I'd throw away a lot of data before I generated a valid tuple of values.

How would you recommend generating a range? If it doesn't exist in autocheck today, what guidance could you provide on how I should add this feature?

Thank you for your time.

Sincerely,
Jeremy Wright

thejohnfreeman · 2015-08-20T01:30:11Z

There's a section in the original SmallCheck paper covering these situations. The general method is to define a custom generator for a new "type" representing the search space, based on a bijection with an existing type. In this case you want to generate integers within a finite range. We could imagine it as an algebraic data type with a large number of zero-arity constructors. The generator would just iterate from one end of the range to the other. For large ranges and few tests, though, it will give just about the worst possible coverage.

I think a good option is to treat the range as a large list and index into it with the signed integral generator (0 returns the low boundary, -1 returns the high) modulo the size of the range. That way you test the edge cases early but quickly spread out across the range.
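The indexing scheme described above can be sketched in plain C++. This is only an illustration, not autocheck code: `index_into_range` is a hypothetical helper name, and the signed index is assumed to come from whatever integral generator is in use. The one subtlety is that C++ `%` truncates toward zero, so negative remainders have to be shifted back into the range by hand.

```cpp
#include <cassert>

// Hypothetical helper (not part of autocheck): maps any signed index onto
// the inclusive range [low, high], treating the range as a circular list.
// Index 0 lands on `low`, index -1 wraps around to `high`.
long long index_into_range(long long index, long long low, long long high) {
    assert(low <= high);
    const long long size = high - low + 1;  // number of values in the range
    long long offset = index % size;        // may be negative for negative index
    if (offset < 0) {
        offset += size;                     // shift into [0, size)
    }
    return low + offset;
}
```

With this mapping, `index_into_range(0, 1970, 2015)` gives 1970 and `index_into_range(-1, 1970, 2015)` gives 2015, so small indices of either sign hit both ends of the range first.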

JeremyLWright · 2015-09-03T15:35:16Z

Hello John,

I tried elements of your suggestion, but I didn't understand how -1 returns the high end of the range. Here's what I have:

    autocheck::catch_reporter rep;
    auto ms = boost::irange(0, 60);
    auto m = boost::irange(1, 13);
    auto h = boost::irange(0, 24);
    auto y = boost::irange(1970, 2020);
    auto d = boost::irange(1, 32);
    const std::vector<int> minute_second_range(ms.begin(), ms.end());
    const std::vector<int> month_range(m.begin(), m.end());
    const std::vector<int> hour_range(h.begin(), h.end());
    const std::vector<int> years_range(y.begin(), y.end());
    const std::vector<int> day_range(d.begin(), d.end());

    autocheck::check<
            std::uint8_t,
            std::uint8_t,
            std::uint8_t,
            std::uint8_t,
            std::uint8_t,
            std::uint8_t>(
        [&minute_second_range,
        &hour_range,
        &month_range,
        &years_range,
        &day_range](
            std::uint8_t& year_idx,
            std::uint8_t& month_idx,
            std::uint8_t& day_idx,
            std::uint8_t& hour_idx,
            std::uint8_t& minute_idx,
            std::uint8_t& second_idx)
    {
        // Using a modulus here to map into the desired range results in a non-uniform distribution.
        // The non-uniformity can be calculated by:
        //   index_count = 2^(sizeof(uint8_t) * 8)   (256 possible index values)
        //   range_coverage = index_count / range.size()
        //   range_overlap_fraction = range_coverage - floor(range_coverage)
        //   range_of_higher_probability = range_overlap_fraction * range.size()   (same as index_count % range.size())
        const auto year{years_range[year_idx % years_range.size()]}; //years [1970 - 1975] more likely.
        const auto month{month_range[month_idx % month_range.size()]}; //months [Jan - April] more likely.
        const auto day{day_range[day_idx % day_range.size()]}; //days [1 - 8] more likely.
        const auto hour{hour_range[hour_idx % hour_range.size()]}; //hours [0-15] more likely.
        const auto minute{minute_second_range[minute_idx % minute_second_range.size()]}; //minutes [0-15] more likely.
        const auto second{minute_second_range[second_idx % minute_second_range.size()]}; //seconds [0-15] more likely.

        // Use the year, month, day ...
        REQUIRE(true); // Catch's REQUIRE is a statement, so it cannot be returned directly
        return true;
    },
        1000,
        autocheck::make_arbitrary<std::uint8_t,std::uint8_t,std::uint8_t,std::uint8_t,std::uint8_t,std::uint8_t>(),
        rep);

Can you recommend any improvements?
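The modulo bias mentioned in the code comments can be checked by brute force. The following is a minimal sketch, independent of autocheck, with hypothetical helper names: it reduces every possible 8-bit index modulo the range size and counts how often each slot is hit. The slots that get one extra hit are exactly the first `256 % range_size` ones.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts how often each slot of a range with
// `range_size` elements is selected when every 8-bit index (0..255)
// is reduced modulo range_size. Assumes range_size > 0.
std::vector<int> modulo_hit_counts(std::size_t range_size) {
    std::vector<int> hits(range_size, 0);
    for (std::size_t index = 0; index < 256; ++index) {
        ++hits[index % range_size];
    }
    return hits;
}

// Number of leading slots that receive one more hit than the rest.
std::size_t favored_slots(std::size_t range_size) {
    return 256 % range_size;
}
```

For the ranges in the snippet above, `favored_slots` gives 6 for the 50 years, 4 for the 12 months, 8 for the 31 days, 16 for the 24 hours, and 16 for the 60 minutes or seconds.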

thejohnfreeman · 2015-09-04T17:36:59Z

The non-uniformity is inherent to the SmallCheck approach. SmallCheck uses a depth parameter (which is named size in autocheck; should maybe change it) to define a subset of values for a given type, then tests every value in each subset up to a maximum depth, trying to find the "smallest" failing example. The subsets are not guaranteed to be non-overlapping, and often are overlapping, so values common to many subsets will appear many times. autocheck does not test every value for a given depth (which I'm not opposed to changing, by the way), but will still exhibit the same phenomenon. I'm going to put you on code review for a range generator.
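To illustrate the overlapping subsets, here is a small sketch under a stated assumption: the subset for an integer type at depth d is taken to be every value in [-d, d], which is how SmallCheck-style integer series are commonly described. It is not autocheck's implementation, and `appearances_up_to_depth` is a hypothetical name. Counting how many subsets each value belongs to shows why small values are tested far more often than large ones.

```cpp
#include <map>

// Hypothetical illustration: for each depth in 0..max_depth, the subset of
// integers is assumed to be [-depth, depth]. Returns how many of those
// subsets each value appears in.
std::map<int, int> appearances_up_to_depth(int max_depth) {
    std::map<int, int> counts;
    for (int depth = 0; depth <= max_depth; ++depth) {
        for (int value = -depth; value <= depth; ++value) {
            ++counts[value];  // value is a member of the subset at this depth
        }
    }
    return counts;
}
```

With `max_depth` set to 3, the value 0 appears in all four subsets, while 3 and -3 appear in only one each.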

See: #12

thejohnfreeman added a commit that referenced this issue Sep 4, 2015

Add range generator

e99d418

See: #12

thejohnfreeman mentioned this issue Sep 4, 2015

Add range generator #13

Merged
