Decrease size of "small" word list #3
Comments
Would this be useful? I'm looking to use this library to generate a 3- or 4-word passphrase, but the words it currently produces are too obscure. I think this list of 60k words could be useful.
@fredspivock If you can find or compile a list that is freely available, without purchase or licensing, I'd be happy to update the library with it! That one looks like it requires payment, but maybe it would be allowed if you only took the words themselves and not the data attached to them? Might be worth asking.
@MrXyfir I got excited! You are right, it is paid. He even includes a link to a much larger list, but it seems like he did some deduping on it. I also noticed it contains proper names, so that could be a deal breaker for some.
@fredspivock I think 10,000 is too small, unless maybe we rename our current small list to medium and use the 10,000 for small? That could work. Ideally I'd like our primary list to be around 40-60k. If you could find one, that'd be great! And yeah, preferably without names.
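For anyone weighing the list sizes mentioned here, a rough sketch of the passphrase trade-off may help. It assumes each word is picked uniformly and independently at random from the list, which is an assumption for illustration and not a statement about how the library selects words. The function name `passphrase_entropy_bits` is hypothetical.

```python
import math


def passphrase_entropy_bits(list_size: int, num_words: int) -> float:
    """Entropy in bits of a passphrase of num_words words.

    Assumes every word is drawn uniformly and independently
    from a list of list_size distinct words.
    """
    if list_size < 1 or num_words < 0:
        raise ValueError("list_size must be >= 1 and num_words >= 0")
    return num_words * math.log2(list_size)


if __name__ == "__main__":
    # Compare the list sizes discussed in this thread for 3 and 4 words.
    for size in (10_000, 50_000, 129_000):
        for count in (3, 4):
            bits = passphrase_entropy_bits(size, count)
            print(f"{size:>7} words x {count}: {bits:.1f} bits")
```

Under that assumption, each extra word adds more entropy than moving from a 10,000-word list to a 50,000-word one, so a smaller list of common words costs less than it might seem.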
I was able to bring the size down by roughly 5,000 by removing bad/offensive words that shouldn't have been in there anyway. See #5.
Given the introduction of "small" and "big" word lists in v3.1.0, I'd like to decrease the size of the "small" word list from ~129k words to 100k at most, and possibly even down to 50k.
The small list should be acceptable for general use, and as lightweight as possible. We should remove any stop words, and any super rare words that you can hardly find a definition for.
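A minimal sketch of how such pruning could be done offline, assuming we have a stop-word set and a word-frequency table from some freely licensed corpus. Both inputs, the `min_count` threshold, and the function name `prune_word_list` are hypothetical; this is not the library's actual build process.

```python
from typing import Iterable, Mapping, Optional


def prune_word_list(
    words: Iterable[str],
    stop_words: set[str],
    frequency: Mapping[str, int],
    min_count: int = 1,
    max_size: Optional[int] = None,
) -> list[str]:
    """Return a pruned word list, most frequent words first.

    Drops blanks, duplicates, stop words, and any word whose
    corpus count is below min_count (a stand-in for "super rare").
    """
    seen: set[str] = set()
    kept: list[str] = []
    for raw in words:
        word = raw.strip().lower()
        if not word or word in seen:
            continue
        seen.add(word)
        if word in stop_words:
            continue
        if frequency.get(word, 0) < min_count:
            continue
        kept.append(word)

    # Sort by descending frequency, then alphabetically for stable output.
    kept.sort(key=lambda w: (-frequency.get(w, 0), w))

    # Optionally cap the list, e.g. at 50_000 or 100_000 entries.
    if max_size is not None:
        kept = kept[:max_size]
    return kept
```

Sorting by frequency before truncating means the cap removes the rarest words first, which matches the goal of keeping the small list suitable for general use.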