Working on software that handles different character encodings? Need a set of files in different encodings to test against? Use the files in data!
Some of the files come from this project: https://github.com/gogs/chardet, which is reflected in LICENSE.
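
One possible way to use the files in a test is sketched below. It assumes the samples sit somewhere under a directory named data relative to the working directory; the candidate codec list and the helper names (first_matching_encoding, scan) are hypothetical and not part of this repository. Note that "decodes without error" is only a rough smoke test, not real charset detection: a permissive codec such as latin-1 accepts any byte sequence.

```python
from pathlib import Path

# Hypothetical candidate list; replace with the encodings your software supports.
CANDIDATES = ("utf-8", "utf-16", "shift_jis", "latin-1")


def first_matching_encoding(raw, candidates=CANDIDATES):
    """Return the first candidate codec that decodes raw without error, else None."""
    for name in candidates:
        try:
            raw.decode(name)
        except (UnicodeDecodeError, LookupError):
            # Either the bytes are invalid for this codec or the codec is unknown.
            continue
        return name
    return None


def scan(data_dir="data"):
    """Map every file under data_dir (assumed layout) to its first matching codec."""
    results = {}
    for path in sorted(Path(data_dir).rglob("*")):
        if path.is_file():
            results[str(path)] = first_matching_encoding(path.read_bytes())
    return results


if __name__ == "__main__":
    for name, encoding in scan().items():
        print(f"{name}: {encoding}")
```

Because the candidates are tried in order, stricter codecs should come first and permissive ones last. In a real test suite you would more likely feed each file to your own detection or decoding code and compare the result with the encoding you expect for that file.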