Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generated floats are not uniformly distributed #30

Open
knutwalker opened this issue Jul 29, 2021 · 4 comments
Open

Generated floats are not uniformly distributed #30

knutwalker opened this issue Jul 29, 2021 · 4 comments
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@knutwalker
Copy link

Hej 👋

The documentation does not say anything about the distribution of the values, though one would assume that primitive values ought to be uniformly distributed. This is not the case for f32 and f64 based on

(u32::random(r) as f32) / (u32::MAX as f32)
and
(u64::random(r) as f64) / (u64::MAX as f64)

Floats above a certain value lose prevision, so using, say, u32::MAX and u32::MAX - 42 as the numerator both yield 1.0, which leads to a bias of the generated floats towards 1.0. This playground illustrates it a bit more: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=465eb2dcf33851195dde9f3a6601ea22

If you want to fix that, a solution would be to use the "safe" value provided by <float>::MANTISSA_DIGITS as the max instead of <int>::MAX. I could ope a PR for that if you like.

@Absolucy
Copy link
Owner

Sure, feel free to open a PR!

@Absolucy Absolucy added bug Something isn't working good first issue Good for newcomers labels Jul 29, 2021
@Absolucy
Copy link
Owner

I'm not too familar how to do this with <float>::MANTISSA_DIGITS. f64::MANTISSA_DIGITS is just... 53. Do you mean something like u64::MAX >> f64::MANTISSA_DIGITS?

@jklott
Copy link
Contributor

jklott commented Feb 4, 2023

I've been looking at this for a bit and did some testing on the matter. You can use this playground to dive into how I was testing: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=e46aa8e1706df69ff68fe5805c3779dc

The results are basically that for values between 1 and 127 (inclusive), the current method won't tell the difference between something like (u32::MAX - 0) as f32 / u32::MAX as f32 and (u32::MAX - 127) as f32 / u32::MAX as f32

The change that @knutwalker has talked about shows that it won't tell the difference only for values 1 and 2. That is, (U32_SAFE_MAX - 0) as f32 / U32_SAFE_MAX as f32 and (U32_SAFE_MAX - 2) as f32 / U32_SAFE_MAX as f32 evaluate the same.

So although I'm not sure exactly what change should be implemented to random, or if one is needed at all, but it appears to only happen in the case u32::MAX >= u32::random >= u32::MAX - 127

@jklott
Copy link
Contributor

jklott commented Feb 11, 2023

So although it does seem for 128 different numbers at the tail end of the distribution it blends together, from my testing it seems to have no noticeable effect on the distribution. I used the code below to test multiple times how the distribution ends up and I feel it is sufficient for a non-cryptographically secure random generator. If you have a PR to submit @knutwalker then I think that wouldn't harm anything, but just leaving this comment to say the issue isn't noticeable.

use std::vec;

use nanorand::WyRand;

use nanorand::Rng;

fn main() {
	let mut rng = WyRand::new();
	//println!("Random number: {}", rng.generate::<f64>());

	let mut distributions = vec![0, 0, 0, 0, 0, 0, 0, 0, 0, 0];

	for _ in 0..=99_999_999 {
		let num = rng.generate::<f32>();
		if num <= 0.1 {
			distributions[0] += 1;
		} else if num <= 0.2 {
			distributions[1] += 1;
		} else if num <= 0.3 {
			distributions[2] += 1;
		} else if num <= 0.4 {
			distributions[3] += 1;
		} else if num <= 0.5 {
			distributions[4] += 1;
		} else if num <= 0.6 {
			distributions[5] += 1;
		} else if num <= 0.7 {
			distributions[6] += 1;
		} else if num <= 0.8 {
			distributions[7] += 1;
		} else if num <= 0.9 {
			distributions[8] += 1;
		} else {
			distributions[9] += 1;
		}
	}

	let mut new_num: f64;
	let mut normal_dis: Vec<f64> = vec![];

	for element in distributions {
		new_num = (element as f64) / 99_999_999.0;
		normal_dis.push(new_num);
	}

	println!("{:?}", normal_dis);
}
[0.09994116099941161, 0.10000555100005551, 0.100042101000421, 0.09999625099996251, 0.0999446709994467, 0.10003830100038301, 0.10000667100006672, 0.0999799509997995, 0.1000539110005391, 0.09999144099991442]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

3 participants