Use cp1252 as default encoding for Strings #300

hkalteBr · 2021-12-21T13:21:50Z

For strings with special characters, the length must be that of the UTF-8-encoded value, or else there's a 1797 ADS error.
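A minimal sketch of the mismatch, assuming the write buffer behaves like a Python bytearray (the variable names are made up for illustration and are not taken from pyads). Sizing the slice by character count replaces fewer bytes than are written, so the buffer silently changes size, which would be consistent with a size-related ADS error:

```python
# Illustration only: a plain bytearray stands in for the write buffer.
value = "Maß"                     # 3 characters, one of them non-ASCII
encoded = value.encode("utf-8")   # "ß" takes 2 bytes in UTF-8

chars = len(value)                # number of characters
nbytes = len(encoded)             # number of encoded bytes

# Slice sized by character count: 3 bytes are replaced by 4, so the buffer grows.
wrong = bytearray(8)
wrong[0:chars] = encoded

# Slice sized by encoded length: the buffer keeps its size.
right = bytearray(8)
right[0:nbytes] = encoded

print(chars, nbytes, len(wrong), len(right))
```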

stlehmann · 2021-12-21T14:38:47Z

> For strings with special characters, the length must be that of the UTF-8-encoded value, or else there's a 1797 ADS error.

@hkalteBr would you mind providing an example for this? Since strings use 1-byte characters, the length should be the same for the string and for its bytes.

stlehmann · 2021-12-21T14:42:21Z

Alright, I see it. Still, this is something that actually shouldn't work, because TwinCAT strings use only one byte per character, so you couldn't write a non-ASCII character to a string variable. You need to use the WSTRING datatype for this.

hkalteBr · 2021-12-21T14:47:33Z

I'm using the UTF-8 encoding attribute for some strings: {attribute 'TcEncoding':='UTF-8'}
So it's a STRING datatype. Do you have another workaround?

stlehmann · 2021-12-21T16:55:45Z

This is a hard one. Honestly, I didn't know about this attribute. It will be hard to read values encoded this way because the size of a character is not fixed. Also, we don't know the encoding on the client side.
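One way to see the difficulty, as a small sketch in plain Python (no pyads calls involved; the sample text is arbitrary): with a variable-width encoding, a byte limit can cut a character in half, and decoding with the wrong codec yields garbled text rather than an error.

```python
text = "Tür"
raw = text.encode("utf-8")      # "ü" occupies two bytes, so 4 bytes in total

# Cutting at a byte limit can split a multi-byte character.
truncated = raw[:2]
try:
    truncated.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False

# Decoding the same bytes with a different codec succeeds but garbles the text.
misread = raw.decode("cp1252")

print(len(text), len(raw), decodable, repr(misread))
```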

hkalteBr · 2021-12-22T08:27:28Z

I've been experimenting a little with the string encoding in TwinCAT. According to the Infosys, the normal string uses cp1252 encoding. Shouldn't we integrate that one instead of utf-8?
I tried it with the write_by_name function (adsSyncWriteReqEx), and now you can write characters from the extended 8-bit range and the PLC displays them correctly.

It doesn't solve my problem with the utf-8 encoded Strings but at least you can write and read special characters with the normal strings.
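A short sketch of why cp1252 fits the one-byte-per-character string, using only Python's built-in codecs (the sample text is arbitrary). The flip side is that characters outside cp1252 cannot be encoded at all:

```python
text = "Größe 5 µm"
encoded = text.encode("cp1252")

# cp1252 is a single-byte encoding: one character maps to one byte.
same_length = len(encoded) == len(text)
round_trip = encoded.decode("cp1252")

# Characters that cp1252 does not contain raise an error on encoding.
try:
    "λ".encode("cp1252")
    encodable = True
except UnicodeEncodeError:
    encodable = False

print(same_length, round_trip == text, encodable)
```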

stlehmann · 2021-12-22T09:35:07Z

Thanks for looking into this. I remember having the encoding issue with symbol comments. I think we already changed the encoding to cp1252 for them. But we use utf-8 for the normal read/write operations at the moment. Would you mind sharing the link to the documentation?

hkalteBr · 2021-12-22T09:46:33Z

https://infosys.beckhoff.com/english.php?content=../content/1033/tc3_plc_intro/2529327243.html&id=
I only found it for TwinCAT 3, but I guess it's the same in TwinCAT 2.
Did you have a reason for using utf-8 for the normal read/write operations?

stlehmann · 2021-12-23T06:51:47Z

My initial guess is that no one really thought about the encoding too much at that time. To address this issue, I suppose we could add a string_encoding parameter to the read/write functions. Alternatively, this could be set on the Connection object.
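A rough sketch of how the two options could combine, with a connection-wide default and a per-call override. The class and method names are hypothetical stand-ins for illustration, not the pyads API, and the cp1252 default is an assumption taken from this discussion:

```python
from typing import Optional


class EncodingSettings:
    """Hypothetical stand-in for a connection object; not the pyads API."""

    def __init__(self, string_encoding: str = "cp1252") -> None:
        # Connection-wide setting (assumed default).
        self.string_encoding = string_encoding

    def encode(self, value: str, string_encoding: Optional[str] = None) -> bytes:
        # A per-call argument overrides the connection-wide setting.
        return value.encode(string_encoding or self.string_encoding)

    def decode(self, raw: bytes, string_encoding: Optional[str] = None) -> str:
        # Assumes a null-terminated string; everything after the first
        # null byte is ignored.
        payload = raw.split(b"\x00", 1)[0]
        return payload.decode(string_encoding or self.string_encoding)


settings = EncodingSettings()
default_bytes = settings.encode("ä")            # uses the connection default
override_bytes = settings.encode("ä", "utf-8")  # per-call override
decoded = settings.decode(b"\xe4bc\x00\x00")
```

Keeping the override optional means existing calls would not have to change, while strings declared with a different encoding could still be handled per call.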

stlehmann · 2021-12-23T06:58:45Z

It is good practice to open an issue before opening a PR so I created #301 on this topic where we can further discuss the issue.

stlehmann

String encoding is an issue that affects most of the read/write functions, so eventually they all need to be altered. Also, I feel we need to add some tests concerning encoding.

stlehmann · 2021-12-23T07:02:38Z

pyads/pyads_ex.py

@@ -1089,7 +1089,7 @@ def adsSumWrite(
        if data_name in structured_data_names:
            buf[offset: offset + data_symbols[data_name].size] = value
        elif data_symbols[data_name].dataType == ADST_STRING:
-            buf[offset: offset + len(value)] = value.encode("utf-8")
+            buf[offset: offset + len(value.encode("utf-8"))] = value.encode("utf-8")


This is a good starting point. We should introduce a constant DEFAULT_TCENCODING = "cp1252". Also we should buffer the encoded value to avoid multiple encodings.
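A minimal sketch of that suggestion, assuming the buffer is a bytearray. The helper name put_string and its bounds check are hypothetical additions for illustration; only the constant name comes from the comment above:

```python
DEFAULT_TCENCODING = "cp1252"  # constant proposed in the review comment


def put_string(buf: bytearray, offset: int, value: str,
               encoding: str = DEFAULT_TCENCODING) -> int:
    """Hypothetical helper: copy an encoded string into buf.

    Returns the number of bytes written.
    """
    encoded = value.encode(encoding)  # encode once and reuse the result
    end = offset + len(encoded)
    if end > len(buf):
        # Slice assignment past the end would silently grow a bytearray.
        raise ValueError("encoded string does not fit into the buffer")
    buf[offset:end] = encoded
    return len(encoded)


buf = bytearray(6)
written = put_string(buf, 1, "äb")
```

Because the slice bounds are derived from the encoded bytes, the buffer length stays the same regardless of which encoding is chosen.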

Update pyads_ex.py

587b16d

hkalteBr changed the title ~~Update pyads_ex.py~~ Fix ADS Error 1797 for Strings with special characters Dec 21, 2021

stlehmann changed the title ~~Fix ADS Error 1797 for Strings with special characters~~ Use cp1252 as default encoding for Strings Dec 23, 2021

stlehmann mentioned this pull request Dec 23, 2021

Use cp1252 as default encoding for Strings #301

Open

stlehmann requested changes Dec 23, 2021


stlehmann mentioned this pull request Dec 23, 2021

Add parameter to specify the string encoding to read/write functions and/or Connection class #302

Open
