[Native Image] Printing in Clojure on Windows uses wrong charset #12249

@borkdude

Describe the Issue

When printing in Clojure via the prn or println functions, the characters get mangled in PowerShell 7 on Windows because the wrong encoding (Cp1252) is used.

This happens even after setting $OutputEncoding = [Console]::OutputEncoding = [System.Text.UTF8Encoding]::new() in PowerShell, and even when passing the "-Djava.file.encoding=UTF-8" "-J-Djava.file.encoding=UTF-8" arguments to native-image at build time or to the binary at run time.

GraalVM Version

25 LTS

Operating System and Version

Windows (windows-2022 on GitHub Actions)

Run Command

./repro
./repro -Djava.file.encoding=UTF-8
$OutputEncoding = [Console]::OutputEncoding = [System.Text.UTF8Encoding]::new()
./repro
./repro -Djava.file.encoding=UTF-8

Expected Behavior

I expect "λ⚙️中文" to be printed in all cases. I want to enforce UTF-8 encoding regardless of how the host system is configured.

Actual Behavior

Given the repro program below, I see the output listed after it:

(ns my.repro
  (:gen-class))

(set! *warn-on-reflection* true)

(defn -main [& _args]
  (prn (System/getProperty "java.file.encoding"))
  (prn (.getEncoding ^java.io.OutputStreamWriter *out*))
  (prn "λ⚙️中文")
  (alter-var-root #'*out* (constantly (java.io.OutputStreamWriter. System/out)))
  (prn "λ⚙️中文"))

Output:

"Cp1252"
"?????"
"λ⚙️中文"
nil
"Cp1252"
"?????"
"λ⚙️中文"
nil
"Cp1252"
"?????"
"λ⚙️中文"
nil
"Cp1252"
"?????"
"λ⚙️中文"

Steps to Reproduce

A full reproducer is available in the following repository:

https://github.com/borkdude/graal-repros/tree/init-at-build-time-output-writer-repro

Note the init-at-build-time-output-writer-repro branch.

Follow .github/workflows/main.yml to see the steps to reproduce.

Additional Context

Clojure uses a java.io.OutputStreamWriter on System.out for printing. This writer is bound to the var *out*.
When *out* is replaced with an equivalent writer at run time, the characters print using the right encoding.
It might have something to do with build time initialization, but I can't reproduce it using a plain class that I init at build time, see Repro.java.
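
A possible workaround, sketched here as an assumption rather than a confirmed fix: replace the root binding of *out* at run time, as the repro already does, but pass the charset explicitly so the writer does not depend on whatever default was captured earlier. The namespace my.repro-utf8 and the helper utf8-out are hypothetical names used only for illustration. Whether the glyphs actually render still depends on the console accepting UTF-8 (for example via the [Console]::OutputEncoding setting mentioned above).

```clojure
(ns my.repro-utf8
  (:gen-class)
  (:import (java.io OutputStreamWriter)
           (java.nio.charset StandardCharsets)))

(defn utf8-out
  "Hypothetical helper: returns a new writer on System/out that encodes
  as UTF-8, independent of the platform default charset."
  []
  (OutputStreamWriter. System/out StandardCharsets/UTF_8))

(defn -main [& _args]
  ;; Swap the root binding at run time so a writer created during
  ;; build-time initialization is no longer used for printing.
  (alter-var-root #'*out* (constantly (utf8-out)))
  ;; Report the encoding the new writer uses, then print the test string.
  (prn (.getEncoding ^OutputStreamWriter *out*))
  (prn "λ⚙️中文"))
```

This only sidesteps the problem from application code; it does not explain why the original writer reports Cp1252.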

Run-Time Log Output and Error Messages

No response
