UTF-8/odd character handling causes header headaches #188
Comments
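Hello, I notice you use the method = argument in the dfSummary() call directly; take a closer look at the vignette (https://cran.r-project.org/web/packages/summarytools/vignettes/rmarkdown.html), you'll see that you need to use print(), i.e.:
print(dfSummary(...), method = 'render')
You'll see a big difference in the rendering... Hope this resolves the issue, and sorry for the delay (I suggest you try StackOverflow to get a quicker response)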
Thanks for the reply. I'll look over at StackOverflow. I did know about
print() and had tested it (but removed it to simplify the example).
It fails with print() as well. I have traced the issue to the characters
themselves. Applying a UTF-8 normalization routine seems to fix it, but the
fix is applied to the data, field by field, not through the routine's options.
These two strings do not match using "=="
str(a)
#chr "בית אל, ניידת - 7"
str(b)
#chr "בית אל, ניידת - 7"
charToRaw(a)
#[1] d7 91 d7 99 d7 aa c2 a0 d7 90 d7 9c 2c c2 a0 d7 a0 d7 99 d7 99 d7 93 d7 aa c2 a0 2d c2 a0 37
charToRaw(b)
#[1] d7 91 d7 99 d7 aa 20 d7 90 d7 9c 2c 20 d7 a0 d7 99 d7 99 d7 93 d7 aa 20 2d 20 37
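For anyone following along: in the raw bytes, string a has c2 a0 wherever string b has 20. In UTF-8, c2 a0 encodes U+00A0 (NO-BREAK SPACE), while 20 is the ordinary ASCII space. A minimal base R sketch of how the mismatch could be located is shown here; the strings are rebuilt from the byte dumps with explicit escapes, and the variable names cp_a, cp_b, diff_pos and a_fixed are only illustrative.

```r
# Rebuild the two strings with explicit escapes so the difference does not
# depend on how an editor or browser displays the whitespace.
a <- "\u05d1\u05d9\u05ea\u00a0\u05d0\u05dc,\u00a0\u05e0\u05d9\u05d9\u05d3\u05ea\u00a0-\u00a07"
b <- "\u05d1\u05d9\u05ea \u05d0\u05dc, \u05e0\u05d9\u05d9\u05d3\u05ea - 7"

# The strings look the same when printed but are not equal.
a == b

# Compare code point by code point to find where they differ.
cp_a <- utf8ToInt(a)
cp_b <- utf8ToInt(b)
diff_pos <- which(cp_a != cp_b)

# Show the differing code points in U+XXXX notation.
sprintf("U+%04X", unique(cp_a[diff_pos]))
sprintf("U+%04X", unique(cp_b[diff_pos]))

# Replacing the no-break spaces with plain spaces makes the strings match.
a_fixed <- gsub("\u00a0", " ", a, fixed = TRUE)
identical(a_fixed, b)
```

This only covers the no-break space seen in this example; other look-alike characters would need their own replacement or a general normalization step.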
These two DO match after performing:
mutate(fixedString = utf8_normalize(badString, map_case = TRUE, map_compat = TRUE, map_quote = TRUE, remove_ignorable = TRUE))
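Since the concern is having to know in advance which fields need fixing, one possible approach is to normalize every character column before summarizing. The sketch below assumes the dplyr and utf8 packages are installed; the data frame df and its column names are made up for illustration, and only map_compat = TRUE is used because that is the option that maps the no-break space to a plain space. Whether this also resolves the dfSummary header rendering is not confirmed here beyond what was observed in this thread.

```r
library(dplyr)
library(utf8)

# Hypothetical data: the same station name, once with a no-break space
# (U+00A0) and once with an ordinary space.
df <- data.frame(
  id = 1:2,
  station = c("\u05d1\u05d9\u05ea\u00a0\u05d0\u05dc",
              "\u05d1\u05d9\u05ea \u05d0\u05dc"),
  stringsAsFactors = FALSE
)

# Normalize all character columns in one pass, without naming them.
df_clean <- df %>%
  mutate(across(where(is.character),
                ~ utf8_normalize(.x, map_compat = TRUE)))

# Before normalization the two values are distinct; afterwards they match.
n_distinct(df$station)
n_distinct(df_clean$station)
```

The cleaned data frame could then be passed on in place of the original one.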
Again, thanks!
…On Sun, Aug 20, 2023 at 2:03 PM Dominic Comtois ***@***.***> wrote:
Hello,
I notice you use the method = argument in the dfSummary() call directly;
take a closer look at the vignette (
https://cran.r-project.org/web/packages/summarytools/vignettes/rmarkdown.html),
you'll see that you need to use print(), i.e.:
print( dfSummary(...), method = 'render')
You'll see a big difference in the rendering... Hope this resolves the
issue, and sorry for the delay (I suggest you try StackOverflow to get a
quicker response)
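As a rough illustration of the suggestion above, the difference is between passing method to dfSummary() itself and passing it to print(). The sketch assumes the summarytools package is installed and uses the built-in iris data as a stand-in for the real data frame; in an R Markdown document the chunk would typically also carry the results='asis' option, as described in the vignette linked above.

```r
library(summarytools)

# Build the summary object first; no method argument here.
summary_obj <- dfSummary(iris)

# Then choose the rendering method when printing.
print(summary_obj, method = "render")
```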
OK, I'll try to look into it in more detail. In the meantime, feel free to share new insights here! Thx
Hi. Love dfSummary. I am processing large numbers of dataframes where some fields have Hebrew characters. I've been able to isolate an example where the original column text causes the dfSummary headers to become RMarkdown headings.
By using stringr to remove everything but alpha/numeric and punctuation, it works, but that approach of course assumes I know which fields to process before passing to dfSummary.
Is this just a known limitation, or a bug, or ...
I've provided a reproducible example (Rmd) and HTML examples of when it fails and when it works.
dfSummary-issue-20230731.zip
Thanks for any insight.