Interface to performance #1

lehins · 2020-02-12T04:30:50Z

Proposal

@idontgetoutmuch, @cartazio and @curiousleo since you guys asked me about this, here is what I suggest.

CC @ekmett (you were the last one to release random) and @phadej (you have the best candidate for a default pure random number generator for Haskell)

These are the initial changes I propose to the random package. At a high level it contains:

Ability to provide efficient generation of all major types, Word8 .. Double, which are then used by the Random class. (Method names are self-explanatory: nextWord8, nextWord16, ...)
Addition of RandomPrimGen and two extra methods to the Random class, randomPrim and randomPrimR, which would allow us to use the same interface for stateful random number generators like mwc-random and pcg-random.
Also added bitmaskWithRejection, which @phadej kindly mentioned on reddit and implemented in splitmix.

Benefits:

What are the benefits of this approach:

It is backwards compatible and breakage at the API level is non-existent. Values generated will be different for the same generator as before, therefore it probably will have to be considered as a breaking change. But as far as users of random are concerned, nothing will have to be done on their part to benefit from this, which is great, because that is a lot of code!
Enormous performance improvement. Just consider how much energy we can save in the world by making CI for all those Haskell projects using QuickCheck run faster.
As mentioned earlier, addition of RandomPrimGen will allow us to use unified API for stateful and pure RNGs
Current RNG libraries can be updated at will, since they will not break with this change

Drawbacks

All libraries that implement RNGs (splitmix, mersenne-random-pure64, mwc-random, pcg-random, etc.) will require a small non-breaking change. Pure generators that want the performance need to implement the new Random class functions, while stateful ones need instances of the RandomPrimGen class. The important part is that none of them will break if they don't receive any changes.
The work that @cartazio has done so far probably will not be compatible.

Performance

A bit of comparison on why this is important. Here is a link with current performance of

random :: (Random a, RandomGen g) => g -> (a, g)

restricted to a :: Word64 for all libraries that have RandomGen instance: https://alexey.kuleshevi.ch/assets/iframes/2019-12-21-random-benchmarks/random64.html
and here is what it can be if we go the route I suggest. The only library I changed to use the native generator for Word64 was the splitmix (64bit) version:

[image: criterion report] https://user-images.githubusercontent.com/2333894/74302315-2e2ea300-4d67-11ea-88f7-2aacdab3f1cd.png

Here is what I had to do to the splitmix package to achieve such an effect: lehins/splitmix@928b9a1

TODO:

Functions for generation of ranges in random still need lots of fixing; I just scratched the surface.
StdGen needs replacement. It is total crap. It's old, time to get something better and faster. I nominate splitmix.
If we choose splitmix, I'd recommend just dropping splitmix's dependency on random and inverting it: make random depend on splitmix. Add type StdGen = SMGen to random and call it a day.
Both stateful and pure RNG libraries will need to have PRs sent their way. All of which should be simple, since all of them already contain the hard bits.

References

Original blogpost that identified the issue: https://alexey.kuleshevi.ch/blog/2019/12/21/random-benchmarks/
Issue that initiated this discussion: Make SplitMix the Default Random Number Generator random-playground#1
Reddit with some discussions: https://www.reddit.com/r/haskell/comments/edr9n4/random_benchmarks/

cartazio · 2020-02-12T05:03:36Z

Ok. Thanks! I'm going to dig into this later this week.


idontgetoutmuch · 2020-02-12T15:29:29Z

@curiousleo and I have just tested random, mwc and splitmix via smallcrush and they all pass.

Code here if you are interested in reproducing: https://github.com/idontgetoutmuch/random-playground/blob/splitmix-replacement/fromStdin.c

We are currently testing via PractRand also. We'll post the results when they are available.

See also idontgetoutmuch/random-playground#1 (comment). @curiousleo is implementing @peteroupc's suggestions.

cartazio · 2020-02-12T15:41:25Z

Several fail big crush. I’ll dig up the old code next week or so. Lots of work and personal commitments this week/month to juggle.


idontgetoutmuch

import System.Random
import System.Random.MWC

instance RandomPrimGen Gen where
  genWord8 = uniform
  genWord16 = uniform
  genWord32 = uniform
  genWord64 = uniform
  genFloat = uniform
  genDouble = uniform

but I am not sure about performance.

I am not sure where to write this but should we have PRs like the instance above tested together and ready to update all (or at least popular) random number generator packages?

cartazio · 2020-02-14T13:29:46Z

I have some concerns too. I’ll share my design / examples later today. Those sizes as primitive ops for the pure ones won’t actually work I think ... at least with some of those apis. I’ll write out my notes and such post haste.


lehins · 2020-02-14T14:28:45Z

@idontgetoutmuch That's right, we'd submit PRs to mwc-random and pcg-random with such instances:

I am not sure where to write this but should we have PRs like the instance above tested together and ready to update all (or at least popular) random number generator packages?

@Shimuuar what's your take on such an interface, and would you be opposed to having an interface like the one suggested in this RandomPrimGen for stateful RNGs? If you don't think a PR with instance RandomPrimGen Gen where could get merged into mwc-random then there would be really no point in adding such a class to the random package either.

@idontgetoutmuch I don't quite understand your concern here, maybe because I don't see which part could affect the performance in a negative way.

I am not sure about performance.

@cartazio Could you elaborate on this:

Those sizes as primitive ops for the pure ones won’t actually work I think

Do you mean that not all packages provide all primitives (e.g. nextWord16)? If that is the case, then it doesn't matter, since all of the next* functions have default implementations that are based on a function that is provided by all APIs, namely next. In fact, if you were to merge a PR like it is right now, none of the packages that provide an instance for RandomGen would break!
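A minimal sketch of how such defaults could look, so that an instance written against the old API keeps compiling. The class name PureGen, the toy generators Lcg and Lcg64, and the helper lcgStep are hypothetical and are not the code from this PR. The defaults also assume that next yields at least 32 uniformly random low bits, which is not true of every generator; a real default would have to consult genRange.

```haskell
import Data.Bits (shiftL, shiftR, (.|.))
import Data.Word (Word8, Word16, Word32, Word64)

-- | Hypothetical stand-in for the extended generator class.
-- Only 'next' is mandatory; every other method falls back to it.
class PureGen g where
  next :: g -> (Int, g)

  nextWord8 :: g -> (Word8, g)
  nextWord8 g = let (i, g') = next g in (fromIntegral i, g')

  nextWord16 :: g -> (Word16, g)
  nextWord16 g = let (i, g') = next g in (fromIntegral i, g')

  nextWord32 :: g -> (Word32, g)
  nextWord32 g = let (i, g') = next g in (fromIntegral i, g')

  -- Default: glue two 32-bit draws together.
  nextWord64 :: g -> (Word64, g)
  nextWord64 g =
    let (hi, g1) = nextWord32 g
        (lo, g2) = nextWord32 g1
    in ((fromIntegral hi `shiftL` 32) .|. fromIntegral lo, g2)

-- | One step of a toy 64-bit LCG, used only for illustration.
lcgStep :: Word64 -> Word64
lcgStep s = s * 6364136223846793005 + 1442695040888963407

-- | An "old style" generator: it defines 'next' only and still compiles.
newtype Lcg = Lcg Word64

instance PureGen Lcg where
  next (Lcg s) = let s' = lcgStep s in (fromIntegral (s' `shiftR` 32), Lcg s')

-- | An "updated" generator: it overrides 'nextWord64' with a native draw.
newtype Lcg64 = Lcg64 Word64

instance PureGen Lcg64 where
  next (Lcg64 s) = let s' = lcgStep s in (fromIntegral (s' `shiftR` 32), Lcg64 s')
  nextWord64 (Lcg64 s) = let s' = lcgStep s in (s', Lcg64 s')
```

The Lcg instance is the "no changes needed" case, and Lcg64 is the opt-in case where a library supplies a faster native method.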

cartazio · 2020-02-14T15:45:32Z

Could you ramp down the pace? There's actually an orthogonal design I have in mind that I can crank out over the next few days; it doesn’t need the same constraints but should work quite nicely. Also I’ve not agreed that this is the design for next random yet.

More fundamentally: there does NEED to be a breaking change in random for a pretty simple reason: the current api does not make it easy to write sampling programs that behave the same on different platforms given the same seed when using non platform independent algorithms on top. This, in my mind, is table stakes for what a good major version bump should deliver. Along with a few other issues.

Providing a good bridge tool for spanning that breakage is / was honestly the challenge that got me into an anxious spiral that delayed releasing and iterating on that for too long. Happily it’s at the top of my list atm.

lehins · 2020-02-14T16:21:39Z

@cartazio Improving random was not something I wanted to do; in fact I thought that this was a solved problem and we could all move on with our lives writing programs that occasionally need some randomness. My blogpost with benchmarks stirred things up a bit and it seems like there is a strong desire in the community to get this thing resolved.

Could you ramp down the pace

I am in no rush at all. I just prefer to get things done instead of having discussions that drag for years.

there does NEED to be a breaking change in random

I couldn't care less whether there is breakage or not; I don't personally use random. But it is something that the rest of the community probably will care about, since there is quite a bit of code that depends on it.

Also I’ve not agreed that this is the design for next random yet.

That is why the title of this ticket is "Proposal". Looking forward to seeing your alternative approach.

the current api does not make it easy to write sampling programs that behave the same on different platforms given the same seed when using non platform independent algorithms on top

That sounds interesting. You probably have some examples of this lying around; could you please share some code snippets or a link that has 'em?

Along with a few other issues.

It sounds like you have this sorted out, but if you don't mind, could you please list those issues, so everyone has a clear idea of what the problem/solution set is?

Providing a good bridge tool for spanning that breakage is / was honestly the challenge that got me into an anxious spiral that delayed releasing and iterating on that for too long.

You don't need to do it all by yourself. There are plenty of people who are willing to help.

Shimuuar · 2020-02-14T17:18:33Z

@Shimuuar what's your take on such an interface, and would you be opposed to having an interface like the one suggested in this RandomPrimGen for stateful RNGs? If you don't think a PR with instance RandomPrimGen Gen where could get merged into mwc-random then there would be really no point in adding such a class to the random package either.

Generally I think the current state of PRNGs is deeply unsatisfactory. Each PRNG library has its own interface and only implements a fraction of generators, and it's impossible to reuse code. Ideally PRNGs should just implement an interface and reuse all derived generators (generate numbers in range, What is the general idea? Is it to modify random? Is it to create a separate library?

Offhand I can say that I think Float & Double shouldn't be part of the interface. They are trivially derived from uniform Word32/Word64. And there's a problem: are we generating in the [0,1] range? (0,1]?

I also played with a generic API for PRNGs and will try to find what I wrote over the weekend.

idontgetoutmuch · 2020-02-14T17:40:31Z

Is it to modify random? Is it to create a separate library?

Modify random

idontgetoutmuch · 2020-02-15T13:32:28Z

Generally I think the current state of PRNGs is deeply unsatisfactory. Each PRNG library has its own interface and only implements a fraction of generators, and it's impossible to reuse code. Ideally PRNGs should just implement an interface and reuse all derived generators (generate numbers in range, What is the general idea? Is it to modify random? Is it to create a separate library?

Offhand I can say that I think Float & Double shouldn't be part of the interface. They are trivially derived from uniform Word32/Word64. And there's a problem: are we generating in the [0,1] range? (0,1]?

I also played with a generic API for PRNGs and will try to find what I wrote over the weekend.

@Shimuuar the idea is to do something like this:

import System.Random
import qualified System.Random.MWC as MWC
import qualified System.Random.PCG as PCG
import qualified System.Random.SFMT as SFMT

instance RandomPrimGen PCG.Gen where
  genWord8 = PCG.uniform
  genWord16 = PCG.uniform
  genWord32 = PCG.uniform
  genWord64 = PCG.uniform
  genFloat = PCG.uniform
  genDouble = PCG.uniform

instance RandomPrimGen SFMT.Gen where
  genWord8 = SFMT.uniform
  genWord16 = SFMT.uniform
  genWord32 = SFMT.uniform
  genWord64 = SFMT.uniform
  genFloat = SFMT.uniform
  genDouble = SFMT.uniform

and then you have the same interface for every stateful RNG:

*Main> PCG.create >>= genDouble
0.582712623703629
*Main> MWC.create >>= genDouble
2.481036288296201e-2
*Main> SFMT.create >>= genDouble
6.1754467473337606e-2
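To illustrate the "same interface for every stateful RNG" idea without depending on any of the three packages, here is a simplified sketch. StatefulGen is an IO-only stand-in for the proposed class, and LcgRef, XorshiftRef and drawMany are toy names invented for this example; none of them come from random, mwc-random, pcg-random or sfmt.

```haskell
import Data.Bits (shiftL, shiftR, xor, (.|.))
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word32, Word64)

-- | Hypothetical, IO-only stand-in for a stateful generator class.
class StatefulGen g where
  genWord32 :: g -> IO Word32

  genWord64 :: g -> IO Word64
  -- Default: glue two 32-bit draws together.
  genWord64 g = do
    hi <- genWord32 g
    lo <- genWord32 g
    pure ((fromIntegral hi `shiftL` 32) .|. fromIntegral lo)

-- | Toy generator 1: a 64-bit LCG that natively yields 32 bits per step.
newtype LcgRef = LcgRef (IORef Word64)

instance StatefulGen LcgRef where
  genWord32 (LcgRef ref) = do
    s <- readIORef ref
    let s' = s * 6364136223846793005 + 1442695040888963407
    writeIORef ref s'
    pure (fromIntegral (s' `shiftR` 32))

-- | Toy generator 2: xorshift64, which natively yields 64 bits per step.
-- The seed must be non-zero.
newtype XorshiftRef = XorshiftRef (IORef Word64)

instance StatefulGen XorshiftRef where
  genWord64 (XorshiftRef ref) = do
    s <- readIORef ref
    let a = s `xor` (s `shiftL` 13)
        b = a `xor` (a `shiftR` 7)
        c = b `xor` (b `shiftL` 17)
    writeIORef ref c
    pure c
  genWord32 g = fromIntegral . (`shiftR` 32) <$> genWord64 g

-- | Code written once against the class works with either generator.
drawMany :: StatefulGen g => Int -> g -> IO [Word64]
drawMany n g = mapM (\_ -> genWord64 g) [1 .. n]
```

Here drawMany plays the role that genDouble plays in the GHCi session: the caller does not need to know which generator is underneath.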

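However, I've noticed all 3 of the RNGs define a class Variate:

MWC: http://hackage.haskell.org/package/mwc-random-0.14.0.0/docs/System-Random-MWC.html#g:3
SFMT: https://hackage.haskell.org/package/sfmt-0.1.1/docs/System-Random-SFMT.html#g:3
PCG: https://hackage.haskell.org/package/pcg-random-0.1.3.6/docs/System-Random-PCG.html#g:2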

Are you suggesting we should put the class definition in random and then just create instances of it (as those 3 packages do) rather than have the class RandomGen? That gives the same interface for all 3 RNGs and doesn't change the interface at all.

@lehins anything you want to add?

Shimuuar · 2020-02-15T16:29:37Z

API

I think the API is not quite right. Here is my take on it. PRNGs generate uniformly distributed numbers in some range. Sometimes it's the full range of Word32 (mwc-random), sometimes it's not (most LCGs). For the latter, generation of uniformly distributed Word32/Word64 is not a primitive operation and is quite complicated! Here is an API that I think should accommodate both:

class Monad m => MonadRandom m where
  -- | Generate uniformly distributed 32-bit word
  uniformWord32    :: m Word32
  -- | Generate uniformly distributed 64-bit word
  uniformWord64    :: m Word64
  -- | Generate uniformly distributed 32-bit word in range [0,n]
  uniformRWord32   :: Word32 -> m Word32
  -- | Generate uniformly distributed 64-bit word in range [0,n]
  uniformRWord64   :: Word64 -> m Word64

Primitives for Word8/16 are absent. Those could be derived from uniformRWord32, and I don't think there are any generators of practical use that use Word8/16 internally.
uniformRWord32/64 is a primitive because if a generator does not generate the full range of Word32, an implementation in terms of uniformWord32 is not efficient.
Primitives for Float/Double are absent. As I already said, the definition is very simple, and there are considerations about whether to include 0 in the range or not.

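wordToFloatZ :: Word32 -> Float
wordToFloatZ x = (fromIntegral i * f_inv_32) + 0.5
  where
    -- Reinterpret the word as signed, so i lies in [-2^31, 2^31)
    i = fromIntegral x :: Int32
    -- 2^-32, the same constant as m_inv_32 in mwc-random
    f_inv_32 = 2.3283064365386962890625e-10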
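As a rough illustration of the "Word8/16 could be derived" point, here is a sketch that trims the class above to its 32-bit pair and derives the narrow types from the range primitive. The Rand monad and its LCG-based instance are toys invented for this example, and the rejection loop is just one well-known way to implement a range draw, not something prescribed in this thread.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word8, Word16, Word32, Word64)

-- | The 32-bit half of the class proposed above.
class Monad m => MonadRandom m where
  -- | Uniformly distributed 32-bit word
  uniformWord32 :: m Word32
  -- | Uniformly distributed 32-bit word in range [0,n]
  uniformRWord32 :: Word32 -> m Word32

-- Derived generators: no extra primitives required.
uniformWord8 :: MonadRandom m => m Word8
uniformWord8 = fromIntegral <$> uniformRWord32 255

uniformWord16 :: MonadRandom m => m Word16
uniformWord16 = fromIntegral <$> uniformRWord32 65535

-- | Toy state monad over a 64-bit LCG state (illustration only).
newtype Rand a = Rand { runRand :: Word64 -> (a, Word64) }

instance Functor Rand where
  fmap f (Rand r) = Rand $ \s -> let (a, s') = r s in (f a, s')

instance Applicative Rand where
  pure a = Rand $ \s -> (a, s)
  Rand rf <*> Rand ra = Rand $ \s ->
    let (f, s1) = rf s
        (a, s2) = ra s1
    in (f a, s2)

instance Monad Rand where
  Rand r >>= k = Rand $ \s -> let (a, s') = r s in runRand (k a) s'

instance MonadRandom Rand where
  uniformWord32 = Rand $ \s ->
    let s' = s * 6364136223846793005 + 1442695040888963407
    in (fromIntegral (s' `shiftR` 32), s')
  -- This toy generator covers all of Word32, so a range draw can reject the
  -- first (2^32 mod k) raw values and reduce the rest modulo k.
  uniformRWord32 n
    | n == maxBound = uniformWord32
    | otherwise = go
    where
      k = n + 1
      t = negate k `mod` k -- equals 2^32 mod k in wrapping Word32 arithmetic
      go = do
        x <- uniformWord32
        if x >= t then pure (x `mod` k) else go
```

A generator whose raw output is not a full Word32 would supply a different uniformRWord32, and the derived uniformWord8 and uniformWord16 would keep working unchanged.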

Another consideration is handling of the PRNG's state (save/restore) and initialization. We need to provide support for both; otherwise it wouldn't be possible, or at least not easy, to work with different PRNGs, since you'd have to switch not only the PRNG but the full initialization code.

Some PRNGs also support more than just generating a stateful stream of random numbers. AFAIR PCG supports generation of many independent streams, and so do counter-based PRNGs (random123). Do we need to accommodate such generators?

Variate class

I think this type class is sort of obvious. But it's also wrong. I strongly suspect that it was introduced in mwc-random and later copied by the other two. What's wrong with it? uniform & uniformR belong to different type classes.

uniform says: generate all possible values with equal probability. (Word8,Word8) clearly admits an instance, but what about uniformR? There's no good definition.
uniformR says: generate all values in a given range with equal probability. Integer works just fine with this. But uniform? No way! There's an infinite number of integers.

This is bad design and should be abandoned.
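One way to picture the split argued for above is two separate classes. This is only a sketch: the class names Uniform and UniformRange, the Gen64 synonym and toyGen are assumptions made for the example and do not come from any of the packages discussed.

```haskell
import Data.Bits (shiftL, shiftR, xor, (.&.), (.|.))
import Data.Word (Word8, Word64)

-- | A source of uniform 64-bit words threaded through a state of type g.
type Gen64 g = g -> (Word64, g)

-- | Types with finitely many values, each drawn with equal probability.
class Uniform a where
  uniform :: Gen64 g -> g -> (a, g)

-- | Types for which an inclusive range has a uniform distribution.
class UniformRange a where
  uniformR :: (a, a) -> Gen64 g -> g -> (a, g)

instance Uniform Word8 where
  uniform gen g = let (w, g') = gen g in (fromIntegral w, g')

-- Pairs fit Uniform naturally; a "range" of pairs has no obvious meaning.
instance (Uniform a, Uniform b) => Uniform (a, b) where
  uniform gen g =
    let (a, g1) = uniform gen g
        (b, g2) = uniform gen g1
    in ((a, b), g2)

-- Integer fits UniformRange; it cannot be Uniform (infinitely many values).
instance UniformRange Integer where
  uniformR (lo, hi) gen g0
    | lo > hi = uniformR (hi, lo) gen g0
    | otherwise = go g0
    where
      n = hi - lo
      -- Smallest value of the form 2^k - 1 that is >= n
      mask = until (>= n) (\m -> 2 * m + 1) 0
      -- Collect 64-bit limbs until they cover the mask, then apply it
      draw acc covered g
        | covered >= mask = (acc .&. mask, g)
        | otherwise =
            let (w, g') = gen g
            in draw ((acc `shiftL` 64) .|. toInteger w)
                    ((covered `shiftL` 64) .|. 0xFFFFFFFFFFFFFFFF)
                    g'
      -- Reject masked candidates that overshoot the range
      go g =
        let (x, g') = draw 0 0 g
        in if x > n then go g' else (x + lo, g')

-- | Toy generator (an LCG with a small output mix), for illustration only.
toyGen :: Gen64 Word64
toyGen s = (s' `xor` (s' `shiftR` 32), s')
  where
    s' = s * 6364136223846793005 + 1442695040888963407
```

With this split, (Word8, Word8) simply has no UniformRange instance and Integer has no Uniform instance, so neither needs a forced or partial definition.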

P.S.

All in all I think we already have an API that doesn't work well. We have the sort-of-standard random, which shouldn't be used. Let's try to do things right this time. It's true that the design space is very complicated here. We have PRNGs that could be implemented as pure functions, and we have ones that require in-place mutation!

cartazio · 2020-02-15T16:43:29Z

There are several gotchas with those algorithms for float and double. The usual unit interval used in extant Haskell libs is wrong. See the commented-out bit linked here: https://github.com/haskell/random/blob/master/src/Data/Distribution/FloatingInterval.hs


cartazio · 2020-02-15T16:51:13Z

https://github.com/haskell/random/blob/master/src/Data/Distribution/FloatingInterval.hs


Shimuuar · 2020-02-15T16:52:54Z

I think it only strengthens argument for not including floating point numbers into base API. It's not possible to include all subtle variations and it's not reasonable to expect that all instances will implement them identically

lehins · 2020-02-15T16:56:12Z

I do not agree with this one:

Primitives for Word8/16 are absent.

We should allow a specific RNG to decide what is the most efficient way to generate 8 and 16 bits of random data. For example, StdGen can't generate a full 32 bits in one go. That being said, they can all have default implementations, which means supplying either genWord32 or genWord64 will be sufficient.

I don't understand the point of these: uniformRWord32 or uniformRWord64. We can decide on the best known approach for generating subranges, for example bitmaskWithRejection.

With regards to floating point numbers, I am leaning towards not including them, for consistency with ranges. Having a single efficient implementation for all RNGs that use either 32 bits or 64 bits seems like a better idea.
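For readers unfamiliar with the bitmask-with-rejection idea mentioned above, a minimal sketch follows. It is written against an arbitrary step function rather than any package's API, and it is not claimed to match the splitmix implementation line for line. toyNext is a toy source modelled on the commonly published SplitMix64 mixing steps, and samples is a helper invented for the example.

```haskell
import Data.Bits (complement, countLeadingZeros, shiftR, xor, (.&.), (.|.))
import Data.Word (Word64)

-- | Uniform value in the inclusive range [0, n], given a step function that
-- produces uniform 64-bit words.
bitmaskWithRejection :: (g -> (Word64, g)) -> Word64 -> g -> (Word64, g)
bitmaskWithRejection gen n = go
  where
    -- Smallest all-ones mask that covers n (n .|. 1 avoids a 64-bit shift)
    mask = complement 0 `shiftR` countLeadingZeros (n .|. 1)
    go g =
      let (x, g') = gen g
          y = x .&. mask
      in if y > n then go g' else (y, g') -- reject and redraw on overshoot

-- | Toy 64-bit source (illustration only): returns (output, next state).
toyNext :: Word64 -> (Word64, Word64)
toyNext s = (z3, s')
  where
    s' = s + 0x9e3779b97f4a7c15
    z1 = (s' `xor` (s' `shiftR` 30)) * 0xbf58476d1ce4e5b9
    z2 = (z1 `xor` (z1 `shiftR` 27)) * 0x94d049bb133111eb
    z3 = z2 `xor` (z2 `shiftR` 31)

-- | The first 'count' draws from [0, n], starting at the given seed.
samples :: Word64 -> Int -> Word64 -> [Word64]
samples n count seed = take count (go seed)
  where
    go s = let (x, s') = bitmaskWithRejection toyNext n s in x : go s'
```

Because the mask is the smallest all-ones value covering n, each candidate is accepted with probability above one half, so the expected number of raw draws per result stays below two.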

Shimuuar · 2020-02-15T17:16:18Z

It's for cases when generating a full Word32 is not a primitive operation. LCGs, for example, generate numbers in the range [0,p-1] where p is a constant (2^31-1 is a common choice). Thus the definition of uniformWord32 is quite complicated: it requires multiple PRNG iterations and possibly rejections, so implementing uniformRWord32 in terms of uniformWord32 will be inherently inefficient.

On one hand, yes, no one in their right mind will use an LCG. On the other, I think it sets a precedent that a generator may produce something other than uniform Word32/64. A generic API that can accommodate such generators is more future-proof and less likely to run into problems with some weird PRNG. In some sense uniformRWord32/64 are more fundamental. They ask to generate N distinct possibilities, and uniformWord32/64 are just special cases which could be implemented more efficiently for some PRNGs.

Even for PRNGs that generate the full range of Word32/Word64, the optimal implementation of uniformRWord32 differs depending on the width of the generator's primitive. If the generator uses Word32 underneath, we should define uniformRWord32 in terms of uniformWord32. If it uses Word64, we should use uniformWord64 instead.
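As a concrete example of a generator whose raw output is not a full Word32, here is a sketch using a Lehmer generator with modulus 2^31-1 and multiplier 48271 (the parameters commonly known as MINSTD). The function names are hypothetical. The range draw works directly on the raw output instead of first assembling a uniform Word32.

```haskell
import Data.Word (Word32, Word64)

-- | The prime modulus 2^31 - 1.
modulus :: Word64
modulus = 2147483647

-- | One Lehmer step. The state must stay in [1, modulus - 1]; a seed in that
-- range never reaches 0 because the modulus is prime.
lehmerStep :: Word64 -> Word64
lehmerStep s = (s * 48271) `mod` modulus

-- | Uniform value in [0, n], taken straight from the raw output.
-- Precondition: n <= modulus - 2, and the seed lies in [1, modulus - 1].
uniformRLehmer :: Word32 -> Word64 -> (Word32, Word64)
uniformRLehmer n s0 = go (lehmerStep s0)
  where
    k = fromIntegral n + 1 :: Word64 -- number of requested outcomes
    m = modulus - 1                  -- number of distinct raw values
    limit = m - m `mod` k            -- largest multiple of k not above m
    go s
      | r < limit = (fromIntegral (r `mod` k), s)
      | otherwise = go (lehmerStep s) -- reject the uneven tail
      where
        r = s - 1                     -- raw value shifted into [0, m - 1]

-- | A stream of draws from [0, n], starting at the given seed.
rangeStream :: Word32 -> Word64 -> [Word32]
rangeStream n s = let (x, s') = uniformRLehmer n s in x : rangeStream n s'
```

A uniformWord32 for this generator would need more than one step per result, since a single step carries fewer than 31 bits, which is why a range draw built on top of it would cost more than the direct version shown here.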

cartazio · 2020-02-15T18:17:22Z

If the monadic interfaces, such as the combinator examples I linked, become the Norm, I think a lot of this complexity goes away.


idontgetoutmuch · 2020-02-16T14:27:21Z

Several fail big crush. I’ll dig up the old code next week or so. Lots of work and personal commitments this week/month to juggle.

The current random fails pretty much everything; on the other hand splitmix passes big crush: https://github.com/tweag/random-quality/tree/master/results.

It should be easy enough to verify the results using the repo.

All RNGs will fail some test for randomness. splitmix is faster than current random and is of higher quality (in the sense that it passes more tests for randomness).

idontgetoutmuch · 2020-02-16T14:35:01Z

Here's the code for random Float from mwc-random

mwcWordToFloat :: Word32 -> Float
mwcWordToFloat x      = (fromIntegral i * m_inv_32) + 0.5 + m_inv_33
    where m_inv_33 = 1.16415321826934814453125e-10
          m_inv_32 =  2.3283064365386962890625e-10
          i        = fromIntegral x :: Int32

similar code exists for splitmix

splitWordToFloat :: Word32 -> Float
splitWordToFloat x = fromIntegral (x `shiftR` 8) * floatUlp
  where
    floatUlp =  1.0 / fromIntegral (1 `shiftL` 24 :: Word32)

@cartazio are you saying both of these are wrong? I couldn't understand what you were driving at in the link you posted.
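
For comparison, here is a self-contained version of the two conversions above with the imports they need, plus the inputs that hit the extreme outputs. The helper name extremes is made up for this sketch. As far as the arithmetic goes, the mwc-random variant maps into (0, 1] while the splitmix variant maps into [0, 1).

```haskell
import Data.Bits (shiftL, shiftR)
import Data.Int (Int32)
import Data.Word (Word32)

-- mwc-random style: reinterpret the word as a signed Int32, scale by 2^-32,
-- then shift by 0.5 + 2^-33 so that zero is never produced.
mwcWordToFloat :: Word32 -> Float
mwcWordToFloat x = (fromIntegral i * m_inv_32) + 0.5 + m_inv_33
  where
    m_inv_33 = 1.16415321826934814453125e-10 -- 2^-33
    m_inv_32 = 2.3283064365386962890625e-10  -- 2^-32
    i        = fromIntegral x :: Int32

-- splitmix style: keep the top 24 bits and scale by 2^-24.
splitWordToFloat :: Word32 -> Float
splitWordToFloat x = fromIntegral (x `shiftR` 8) * floatUlp
  where
    floatUlp = 1.0 / fromIntegral (1 `shiftL` 24 :: Word32)

-- (name, smallest output, largest output), using the inputs that reach them.
extremes :: [(String, Float, Float)]
extremes =
  [ ("mwc-random", mwcWordToFloat 0x80000000, mwcWordToFloat 0x7FFFFFFF)
  , ("splitmix",   splitWordToFloat 0,        splitWordToFloat maxBound)
  ]
```

The difference matters to callers: code that takes a logarithm of the sample is safe with the first range, while code that assumes the sample is strictly below 1 is safe with the second.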

idontgetoutmuch · 2020-02-16T14:37:11Z

If the monadic interfaces, such as the combinator examples I linked, become the Norm, I think a lot of this complexity goes away.

I can't see any links. The only link you posted is https://github.com/haskell/random/blob/master/src/Data/Distribution/FloatingInterval.hs but I can't see a monadic interface in it.

idontgetoutmuch · 2020-02-17T15:36:24Z

Here's another implementation for getting random floating point numbers

https://github.com/mokus0/random-fu/blob/69a563a7b0cf444748e4b38a8bda7ada0b9acf14/random-source/src/Data/Random/Internal/Words.hs#L103

wordToFloat :: Word64 -> Float
wordToFloat x = (encodeFloat $! toInteger (x .&. 0x007fffff {- 2^23-1 -} )) $ (-23)

I think it would be convenient to put such a function in random rather than lots of packages implementing it themselves but I don't feel strongly about it.
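
As a rough illustration of what a shared helper would have to pin down, the sketch below restates the random-fu conversion with its imports and exposes its resolution. The names gridStep and largestOutput are assumptions for this example only. Because the function keeps 23 bits, its outputs are the multiples of 2^-23 in [0, 1 - 2^-23], half as many distinct values as a 24-bit conversion produces.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word64)

-- random-fu style: keep the low 23 bits and scale them by 2^-23.
wordToFloat :: Word64 -> Float
wordToFloat x = encodeFloat (toInteger (x .&. 0x007fffff)) (-23)

-- Spacing between neighbouring outputs (2^-23).
gridStep :: Float
gridStep = wordToFloat 1

-- Largest value the conversion can return (1 - 2^-23).
largestOutput :: Float
largestOutput = wordToFloat maxBound
```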

cartazio · 2020-02-17T15:54:17Z

Agreed.

Shimuuar · 2020-02-17T20:20:05Z

I think we're discussing many things at once: the set of primitive generators, how to generate floating point numbers, the Variate type class, and the API at large. It becomes somewhat difficult to follow. @lehins, @curiousleo, @idontgetoutmuch would you mind if I create separate issues with summaries of this thread relevant to each issue?

I think it would be convenient to put such a function in random rather than lots of packages implementing it themselves but I don't feel strongly about it.

Completely agree. We should add functions for sampling in ranges as well.

Shimuuar · 2020-02-17T20:24:47Z

I finally looked through the PR, and I think the main problem with RandomPrimGen is the lack of a universal API. There's simply no way to write code which works both for stateful PRNGs like mwc-random and for pure ones. This is a big problem, since it precludes writing generic libraries. For example, there's quite a lot of code in mwc-random that could and should be generalized.

cartazio · 2020-02-17T21:41:27Z

indeed. like Monads :) write out the distributions in a sorta combinator notation and just provide the base generator as a monadic expression you plop in! (at least that's one way to roll) I'll think about it more
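
One way to picture the combinator idea is sketched below, using only base: the combinator takes the base generator as a monadic action, so the same code runs against a mutable generator in IO and against a pure computation wrapped in ST. Every name here (lcgStep, pairOf, nextIO, nextST, purePair) is hypothetical, and the LCG is a toy stand-in for a real generator.

```haskell
import Control.Monad.ST (ST, runST)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)
import Data.Word (Word64)

-- Toy LCG step (Knuth's MMIX constants). Illustration only.
lcgStep :: Word64 -> Word64
lcgStep s = s * 6364136223846793005 + 1442695040888963407

-- Generic combinator: knows nothing about how the base generator is stored.
pairOf :: Monad m => m Word64 -> m (Word64, Word64)
pairOf next = (,) <$> next <*> next

-- Stateful backend: the state lives in an IORef.
nextIO :: IORef Word64 -> IO Word64
nextIO ref = do
  s <- readIORef ref
  let s' = lcgStep s
  writeIORef ref s'
  pure s'

-- Pure backend: the same update inside ST, sealed off by runST.
nextST :: STRef s Word64 -> ST s Word64
nextST ref = do
  s <- readSTRef ref
  let s' = lcgStep s
  writeSTRef ref s'
  pure s'

-- A pure function built from the generic combinator.
purePair :: Word64 -> (Word64, Word64)
purePair seed = runST (do
  ref <- newSTRef seed
  pairOf (nextST ref))
```

Running pairOf (nextIO ref) on an IORef seeded with the same value yields the same pair as purePair, so a distribution written once against the monadic action serves both styles of generator.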

idontgetoutmuch · 2020-02-18T11:33:43Z

I think we're discussing many things at once: the set of primitive generators, how to generate floating point numbers, the Variate type class, and the API at large. It becomes somewhat difficult to follow. @lehins, @curiousleo, @idontgetoutmuch would you mind if I create separate issues with summaries of this thread relevant to each issue?

@Shimuuar That would be great :)

Shimuuar · 2020-02-18T19:54:06Z

OK I've started.

Scope of changes to random #4: what are we doing here
Set of primitive generators #5: primitive generators
Generators for floating point numbers #6: generation of floating points

idontgetoutmuch · 2020-04-29T08:09:09Z

I wanted to note which of the issues on https://github.com/haskell/random/ this (soon-to-be) PR addresses:

Very low throughput haskell/random#51
incorrect distribution of randomR for floating-point numbers haskell/random#53
randomR could produce NaNs when the upper bound is infinity haskell/random#54 - I suggest this is a "won't fix" (*)
Why does random for Float and Double produce exactly 24 or 53 bits? haskell/random#58 - this is addressed by Generate Float and Double via division #118 and Unbiased floating point number in unit interval #102.
The seeds generated by split are not independent haskell/random#25

(*) unless someone can specify what the behaviour should be. Currently we have

 map (\s -> runGenState_ (mkStdGen s) (\g -> uniformRM (0.0, 1.0 / 0.0) g) :: Float) [0..9]
[Infinity,Infinity,Infinity,Infinity,Infinity,Infinity,Infinity,Infinity,Infinity,Infinity]

Maybe consistent NaNs would be better, telling the user that their request has no reasonable answer.
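
A small sketch of why an infinite upper bound has no good answer, assuming the range is produced by an affine rescaling of a unit-interval sample. The function scaleToRange and its formula are assumptions for illustration and are not taken from the PR. Under that assumption every positive sample is sent to Infinity, and a sample of exactly zero becomes NaN because 0 * Infinity is NaN in IEEE 754 arithmetic.

```haskell
-- Affine rescaling of a unit-interval sample u into the range (lo, hi).
-- Assumed formula, for illustration only.
scaleToRange :: Float -> (Float, Float) -> Float
scaleToRange u (lo, hi) = lo + u * (hi - lo)

-- Positive infinity, written the same way as in the session above.
infinity :: Float
infinity = 1.0 / 0.0

-- Any positive sample is pushed to Infinity.
positiveSample :: Float
positiveSample = scaleToRange 0.5 (0.0, infinity)

-- A zero sample multiplies 0 by Infinity and yields NaN.
zeroSample :: Float
zeroSample = scaleToRange 0.0 (0.0, infinity)
```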

* Generate Float and Double via division

Coverage
========

Before: 0.787% of representable Floats in the unit interval reached
After: 7.874% of representable Floats in the unit interval reached

(A similar enumeration for Double is impossible, but it is very likely that coverage is increased for Doubles too.)

Performance
===========

Before:
pure/random/Float mean 331.1 μs ( +- 21.67 μs )
pure/uniformR/unbounded/Float mean 324.6 μs ( +- 2.849 μs )
pure/random/Double mean 411.3 μs ( +- 5.876 μs )
pure/uniformR/unbounded/Double mean 416.8 μs ( +- 41.93 μs )

After:
pure/random/Float mean 27.32 μs ( +- 158.0 ns )
pure/uniformR/unbounded/Float mean 27.37 μs ( +- 422.0 ns )
pure/random/Double mean 27.34 μs ( +- 303.1 ns )
pure/uniformR/unbounded/Double mean 27.49 μs ( +- 983.7 ns )

* Floating point ranges inclusive in upper bound
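
The coverage percentages can be sanity-checked with a little arithmetic. The sketch below counts the non-negative Float bit patterns up to and including 1.0 and turns a number of reached values into a percentage. The names floatsInUnitInterval and coveragePercent are made up for this example. The observation that the two figures match 2^23 and 10 * 2^23 reached values is an inference from the numbers, not something the commit message states.

```haskell
import GHC.Float (castFloatToWord32)

-- Non-negative Floats are ordered like their bit patterns, so the number of
-- Floats in [0, 1] is the bit pattern of 1.0 plus one (to count 0.0 itself).
floatsInUnitInterval :: Integer
floatsInUnitInterval = toInteger (castFloatToWord32 1.0) + 1

-- Percentage of representable Floats in [0, 1] covered by 'reached' values.
coveragePercent :: Integer -> Double
coveragePercent reached =
  100 * fromInteger reached / fromInteger floatsInUnitInterval

-- 2^23 distinct outputs give roughly the reported "before" figure.
beforePercent :: Double
beforePercent = coveragePercent (2 ^ (23 :: Int))

-- Ten times as many outputs give roughly the reported "after" figure.
afterPercent :: Double
afterPercent = coveragePercent (10 * 2 ^ (23 :: Int))
```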

curiousleo · 2020-05-13T08:18:17Z

Further issues addressed:

Add Random instances for tuples haskell/random#26: there are tuple instances for Uniform now, see

random/System/Random/Internal.hs, lines 902 to 922 in cd5421f:

-------------------------------------------------------------------------------
-- 'Uniform' instances for tuples
-------------------------------------------------------------------------------

instance (Uniform a, Uniform b) => Uniform (a, b) where
  uniformM g = (,) <$> uniformM g <*> uniformM g

instance (Uniform a, Uniform b, Uniform c) => Uniform (a, b, c) where
  uniformM g = (,,) <$> uniformM g <*> uniformM g <*> uniformM g

instance (Uniform a, Uniform b, Uniform c, Uniform d) => Uniform (a, b, c, d) where
  uniformM g = (,,,) <$> uniformM g <*> uniformM g <*> uniformM g <*> uniformM g

instance (Uniform a, Uniform b, Uniform c, Uniform d, Uniform e) => Uniform (a, b, c, d, e) where
  uniformM g = (,,,,) <$> uniformM g <*> uniformM g <*> uniformM g <*> uniformM g <*> uniformM g

instance (Uniform a, Uniform b, Uniform c, Uniform d, Uniform e, Uniform f) => Uniform (a, b, c, d, e, f) where
  uniformM g = (,,,,,) <$> uniformM g <*> uniformM g <*> uniformM g <*> uniformM g <*> uniformM g <*> uniformM g

instance (Uniform a, Uniform b, Uniform c, Uniform d, Uniform e, Uniform f, Uniform g) => Uniform (a, b, c, d, e, f, g) where
  uniformM g = (,,,,,,) <$> uniformM g <*> uniformM g <*> uniformM g <*> uniformM g <*> uniformM g <*> uniformM g <*> uniformM g

System/Random.hs:43:1: warning: [-Wtabs] haskell/random#55
read :: StdGen fails for strings longer than 6 haskell/random#59: see StdGen: constructor accessible via Internal only #123
Add Random instance for Natural haskell/random#44: see Add 'instance UniformRange Natural' #126

idontgetoutmuch reviewed Feb 14, 2020

This was referenced Feb 18, 2020

Set of primitive generators #5

Closed

Generators for floating point numbers #6

Closed

Add script to compare benchmarks (#114)

99990d1

idontgetoutmuch and others added 26 commits April 29, 2020 11:46

Check issue remains fixed

f913dcd

Update test/Spec.hs

b72f1a5

Co-Authored-By: Leonhard Markert <[email protected]>

Respond to comments

8ca0d1e

Merge pull request #117 from idontgetoutmuch/unit-tests

22a374e

Check issue remains fixed

Fix all regressions relative to v1.1 (#116)

7f91c2f

* Make bitmask-with-rejection non-recursive
* INLINE some uniformRM implementations

Move randomIO and randomRIO outside of Random class. Switch to …

adc3965

…`MoandIO`

Merge pull request #119 from idontgetoutmuch/monad-io-global-stdgen

41863a2

Make global StdGen to work in MonadIO

Minor cleanups:

0058d77

* Remove -fobject-code compilation, since Cmm was removed
* Fix example in cabal file
* Take care of some compile warnings in legacy benchmarks

Lagacy benchmarks code formatting and cleanup. Get rid of compilation…

f1cbbb7

… warnigns

Remove unnecessary CPP uses

424ccdf

Fix "default-language" warning, run cabal format

85e6b5c

Merge pull request #120 from idontgetoutmuch/cleanup

a091bfc

Cleanup

Add stack artifacts to .gitignore

609ebc3

Document uniform and uniformR

2013e25

Docs ported from #98

More self-contained docs for uniform and uniformR

4ad8a86

Fix section include

07016de

Merge pull request #121 from idontgetoutmuch/uniform-uniformr-docs

5e78a97

Improve 'uniform' and 'uniformR' docs

Refactor naming of mutable generators and switch to `TypeFamilyDepend…

cd5421f

…encies` for `Frozen`

Merge pull request #122 from idontgetoutmuch/refactor-mutable-gens

f2319cc

Refactor mutable generators

StdGen: constructor accessible via Internal only (#123)

9ee79a7

Fixes haskell#59 by making 'StdGen' not an instance of 'Read'.

Ready for review

eab4f5a

Incorporate feedback

d27ed91

Replace dialect and order traditionally

6969cbb

Merge pull request #124 from idontgetoutmuch/release-notes-new

31f9df2

Ready for review

Add 'instance UniformRange Natural' (#126)

1b07f29

Slight improvement in performance. (#125)

73f1f6e

* Slight improvement in performance. (for small lengths it doubles the performance)
* Addition of a couple tests for ByteString generation
* Add `Eq` and `NFData` instances for `StdGen`
* Add benchmark for generation of `ShortByteString`s
