Allow ANSI or other encoded text files #534

Open
giwul70 opened this issue Feb 1, 2025 · 5 comments

Comments

@giwul70

giwul70 commented Feb 1, 2025

First of all, compliments on grepWin!
Nice program.

Allow me to make a suggestion: please do not restrict searches to UTF-8 encoded text files, but also allow files that are encoded slightly differently and are still just simple text files.

Please see attachment.

Image

Image

For files that are indeed UTF-8, grepWin has an option to export the path\filename AND the matched data, so the results can be processed further. That's really great: I need that data alongside the file names.

Image

I am new to grepWin, so maybe this suggestion has been made before.

Thanks.

@stefankueng
Owner

grepWin tries to detect the encoding automatically, and uses the current code page for text files as default if no specific encoding can be detected.
The checkbox "treat files as utf8" is there to force grepWin to use utf8 instead of what you call ANSI.

So just uncheck that checkbox, and the text will be treated as ANSI.
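
For readers unfamiliar with this kind of fallback, here is a minimal Python sketch of the general idea: look for a byte order mark, try strict UTF-8, and otherwise fall back to a single-byte code page. This is not grepWin's actual code. The function name `decode_text` and the default fallback of `cp1252` are assumptions made for illustration, and UTF-32 is ignored for brevity.

```python
import codecs
from typing import Tuple


def decode_text(data: bytes, fallback: str = "cp1252") -> Tuple[str, str]:
    """Return (text, encoding_used) for a raw byte buffer.

    Hypothetical helper for illustration; the fallback code page is an
    assumption and should match the machine that produced the files.
    """
    # A UTF-8 BOM is unambiguous, so strip it and decode as UTF-8.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8-sig"
    # A UTF-16 BOM tells the "utf-16" codec which byte order to use.
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        return data.decode("utf-16"), "utf-16"
    try:
        # Strict decoding fails on byte sequences that are not valid UTF-8.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Bytes that the code page leaves undefined become U+FFFD.
        return data.decode(fallback, errors="replace"), fallback
```

With this sketch, `decode_text("café".encode("cp1252"))` takes the fallback path because the lone byte `0xE9` is not valid UTF-8, while pure ASCII input is reported as UTF-8 since the two encodings agree on those bytes.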

@giwul70
Author

giwul70 commented Feb 1, 2025

When I uncheck both "Treat files as UTF8" and "Treat files as binary", I don't get any results.
The regex is okay: many other similar files are handled correctly. I believe it has to do with the encoding.
When I open such a file in an editor and re-encode it as UTF-8, grepWin handles it like the other ('correct') ones.
I have not found a free batch encoder that converts all the files to UTF-8. Doing this one by one is quite a workload.
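
A batch conversion like this can be scripted. The following is a minimal Python sketch, assuming every file that is not already valid UTF-8 is Windows-1252; the function name `reencode_to_utf8`, the `*.txt` pattern, and the default source encoding are all hypothetical choices for illustration. It overwrites files in place, so it should only be run on a copy of the folder.

```python
from pathlib import Path


def reencode_to_utf8(root, pattern="*.txt", source_encoding="cp1252"):
    """Rewrite every non-UTF-8 file under `root` as UTF-8.

    Returns the list of paths that were converted. The source encoding
    is an assumption: files in any other encoding (UTF-16, for example)
    would be converted incorrectly.
    """
    converted = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        raw = path.read_bytes()
        try:
            # Already valid UTF-8 (pure ASCII included): leave untouched.
            raw.decode("utf-8")
            continue
        except UnicodeDecodeError:
            pass
        # Bytes undefined in the source code page become U+FFFD.
        text = raw.decode(source_encoding, errors="replace")
        path.write_bytes(text.encode("utf-8"))
        converted.append(path)
    return converted
```

Called as `reencode_to_utf8(r"C:\copy\of\files")` (a made-up path), it returns the files it changed, which makes it easy to check a handful of them in an editor before searching again.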

BTW it is nice to be able to export both filepaths and matches to a text file.

Image

@stefankueng
Owner

So maybe those files are not pure text files but contain a lot of zero bytes.
Without knowing your regex or having a sample of a file that doesn't work, I can't really help you.
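
The zero-byte theory is easy to check outside grepWin. Below is a minimal Python sketch that reports the fraction of NUL bytes per file; the names `nul_ratio` and `files_with_nuls`, the `*.txt` pattern, and the threshold are hypothetical and say nothing about how grepWin itself classifies files. A plain Windows-1252 text file normally contains no NUL bytes at all, whereas UTF-16 text made of mostly Latin characters is roughly half NUL bytes.

```python
from pathlib import Path


def nul_ratio(data: bytes) -> float:
    """Fraction of bytes in `data` that are NUL (0x00)."""
    if not data:
        return 0.0
    return data.count(b"\x00") / len(data)


def files_with_nuls(root, pattern="*.txt", threshold=0.0):
    """List (path, ratio) for files whose NUL ratio exceeds `threshold`.

    The threshold is an illustrative knob, not a value taken from any tool.
    """
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        if path.is_file():
            ratio = nul_ratio(path.read_bytes())
            if ratio > threshold:
                hits.append((path, ratio))
    return hits
```

If the problem files show up in the result of `files_with_nuls(...)`, the encoding is probably not a single-byte code page, or the files contain embedded binary data.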

@giwul70
Author

giwul70 commented Feb 1, 2025

The very first screenshot is what Notepad shows when opening such 'non-UTF8' files.
In grepWin they are sometimes listed as binary and sometimes as ANSI.
When I open them in EditPad Pro, the encoding reads Windows 1252 Western European.

Now in EditPad Pro I can select UTF-8 and choose to encode the original data with another character set (UTF-8).
When I do that, grepWin handles the file correctly.
Problem is: I have some 1650 of those files...
(The regex works fine on all the other similar files)

Anyway, bad luck.

Thanks for the replies.

@stefankueng
Owner

Screenshots don't help. I really need the regex and at least one file that doesn't match that regex but should.
