Feature request: alternative translations for C0 code #1

Open
huangqqss opened this issue Jan 21, 2025 · 1 comment

Comments

huangqqss commented Jan 21, 2025

I found this project while looking for ways to display cp437 ASCII art, via this discussion: https://forums.freebsd.org/threads/cp437-display-on-console.91866/

Since cp437 defines symbols for the code points of BS, CR, NL, FF, ESC, and TAB, the tool would be more useful if these code points had customizable translations. I think there could be options, one for each of the above code points, to select the translation style.

For example, the CR code point could be translated either to a literal carriage return or to ♪ (U+266A). The DOS-style "\r\n" also needs an option: whatever CR and NL are translated into, should "\r\n" still be treated as a line break? Since CR, NL, and "\r\n" could all lose their line-breaking function, the -w option becomes even more useful for some ASCII art.

For the NUL code point, there could be an option to choose the symbol. I usually prefer showing NUL as whitespace, but sometimes as a dot. Although NUL is seldom used in ASCII art, displaying it is useful when viewing binary files or incorrectly decoded Unicode text.
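
To make the request concrete, here is a minimal Python sketch of the behavior I have in mind. Everything in it is hypothetical: `translate`, `as_symbol`, and `nul` are names invented for illustration and are not existing dos2ansi options. The glyph table uses the usual Unicode equivalents of the cp437 symbols. The question of whether "\r\n" should still break a line is left out of the sketch.

```python
# Hypothetical sketch: none of these names exist in dos2ansi.
# cp437 glyphs for the C0 codes discussed above, as Unicode characters.
CP437_GLYPHS = {
    0x08: "\u25d8",  # BS  -> ◘
    0x09: "\u25cb",  # TAB -> ○
    0x0A: "\u25d9",  # NL  -> ◙
    0x0C: "\u2640",  # FF  -> ♀
    0x0D: "\u266a",  # CR  -> ♪
    0x1B: "\u2190",  # ESC -> ←
}


def translate(data: bytes, as_symbol=frozenset(), nul=" ") -> str:
    """Decode cp437 bytes, showing the code points in as_symbol as glyphs.

    Code points not listed in as_symbol keep their control meaning.
    NUL is always replaced by the caller-chosen string `nul`.
    """
    out = []
    for byte in data:
        if byte == 0x00:
            out.append(nul)
        elif byte in as_symbol and byte in CP437_GLYPHS:
            out.append(CP437_GLYPHS[byte])
        elif byte < 0x20:
            out.append(chr(byte))  # keep as a literal control character
        else:
            out.append(bytes([byte]).decode("cp437"))
    return "".join(out)
```

With this, `translate(b"A\rB", as_symbol={0x0D})` yields `A♪B`, while `translate(b"A\rB")` keeps the carriage return, and `translate(b"\x00", nul=".")` shows NUL as a dot.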

Owner

Zirias commented Mar 19, 2025

Let's check whether I understood you correctly: You want to selectively disable interpretation of control characters, so that the symbol (from the VGA BIOS) is printed instead?

Right now, the tool works on a "virtual VGA card" which provides a "put character" function that works exactly like the old DOS console output functions did: it only knows a very limited set of control characters (0x00, 0x07, 0x08, 0x09, 0x0a, 0x0d, 0x1b); for everything else, the character is placed "as is" in the virtual video memory. You can read that code here for reference: https://github.com/Zirias/dos2ansi/blob/master/src/bin/dos2ansi/vgacanvas.c#L131
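
As a rough illustration of that model, here is a toy Python sketch. It is not the code in vgacanvas.c: the class `Canvas`, its methods, the tab width of 8, and the exact handling of 0x00, 0x07, and 0x1b (simply skipped here) are assumptions made for the example. Only the set of special byte values and the "store everything else as is" rule come from the description above.

```python
class Canvas:
    """Toy model of a text-mode screen; hypothetical, not the dos2ansi code."""

    def __init__(self, width=80, tab=8):
        self.width = width
        self.tab = tab
        self.rows = [[]]  # rows[y][x] holds one raw byte value
        self.x = 0
        self.y = 0

    def _store(self, byte):
        # Grow the canvas as needed, padding with spaces.
        while len(self.rows) <= self.y:
            self.rows.append([])
        row = self.rows[self.y]
        while len(row) <= self.x:
            row.append(0x20)
        row[self.x] = byte

    def put(self, byte):
        if byte in (0x00, 0x07, 0x1B):
            return  # assumed: NUL and BEL print nothing, ESC is handled elsewhere
        if byte == 0x08:    # BS: one column left
            self.x = max(0, self.x - 1)
        elif byte == 0x09:  # TAB: next tab stop (assumed every 8 columns)
            self.x = min(self.width - 1, (self.x // self.tab + 1) * self.tab)
        elif byte == 0x0A:  # NL: one line down, column unchanged
            self.y += 1
        elif byte == 0x0D:  # CR: back to column 0, line unchanged
            self.x = 0
        else:               # everything else goes into "video memory" as is
            self._store(byte)
            self.x += 1
            if self.x >= self.width:  # wrap at the right edge
                self.x = 0
                self.y += 1

    def lines(self):
        return [bytes(row) for row in self.rows]
```

Feeding `b"AB\nC\rD"` into `put` byte by byte leaves `[b"AB", b"D C"]` in `lines()`: the NL moves down without resetting the column, and the CR returns to column 0 so that `D` lands at the start of the second row. A byte such as 0x01 is not in the special set, so it is stored unchanged.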

In addition, the file reader by default "swallows" 0x1a characters, because that's what DOS file reading functions interpreted as an end-of-file marker, so this character also can't appear on the screen.
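
A minimal Python sketch of that idea follows. It assumes the reader stops at the first 0x1a, as DOS text-mode reads did; whether dos2ansi stops there or merely drops the byte is not spelled out in this comment, and the function name and the `honor_eof` parameter are hypothetical.

```python
DOS_EOF = 0x1A  # Ctrl-Z, the DOS text-mode end-of-file marker


def read_dos_text(data: bytes, honor_eof: bool = True) -> bytes:
    """Return the part of data a DOS text-mode reader would deliver.

    Hypothetical helper: with honor_eof set, everything from the first
    0x1a byte onward is discarded; otherwise the data is returned whole.
    """
    if not honor_eof:
        return data
    end = data.find(bytes([DOS_EOF]))
    return data if end < 0 else data[:end]
```

For instance, `read_dos_text(b"abc\x1adef")` returns `b"abc"`, so the 0x1a byte never reaches the canvas.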

So far, I didn't see any point in making that behavior configurable, because there's no way in vanilla DOS to make such characters appear on screen from a file ... IOW any "ansi art" file meant for display with e.g. TYPE can only ever contain these in their meaning as control characters. AFAIK, the only way to display the actual characters on a DOS PC is by programmatically putting them into video memory. Therefore, can you explain to me what kind of files such a feature would be useful for?

Regarding CR and NL, they are interpreted in the way they were originally meant, one of them putting the cursor back into column 0, the other one moving it one line down.
