Use cp1252 as default encoding for Strings #300

Open
wants to merge 1 commit into
base: master
from

Conversation

hkalteBr

For strings with special characters, the slice length needs to be that of the UTF-8 encoded value; otherwise there's an ADS error 1797.
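
To illustrate the mismatch, here is a minimal sketch. It assumes the write buffer is a `bytearray`; the names `value`, `buf`, and `buf_ok` are only for illustration and are not taken from pyads. Slicing by the character count while assigning the UTF-8 bytes changes the buffer size, which would be consistent with a size-related ADS error:

```python
value = "Größe"                    # 5 characters, two of them non-ASCII
encoded = value.encode("utf-8")    # "ö" and "ß" take 2 bytes each in UTF-8

char_count = len(value)
byte_count = len(encoded)

# Assumption: the write buffer is a bytearray sized for the payload.
buf = bytearray(10)
# The slice covers char_count bytes, but byte_count bytes are assigned,
# so the bytearray silently grows by the difference.
buf[0:char_count] = encoded
grown_size = len(buf)

# Slicing by the encoded length keeps the buffer size unchanged.
buf_ok = bytearray(10)
buf_ok[0:byte_count] = encoded
fixed_size = len(buf_ok)
```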

@hkalteBr hkalteBr changed the title from "Update pyads_ex.py" to "Fix ADS Error 1797 for Strings with special characters" on Dec 21, 2021
@stlehmann
Owner

> For strings with special characters, the slice length needs to be that of the UTF-8 encoded value; otherwise there's an ADS error 1797.

@hkalteBr would you mind providing an example of this? Since TwinCAT strings use 1-byte characters, the length should be the same for the string and its encoded bytes.

@stlehmann
Owner

stlehmann commented Dec 21, 2021

[image: grafik]

Alright, I see it. Still, this is something that actually shouldn't work, because characters in a TwinCAT STRING are only one byte wide and you couldn't write a non-ASCII character to a string variable. You need to use the WSTRING datatype for this.

@hkalteBr
Author

I'm using the UTF-8 encoding attribute for some strings: {attribute 'TcEncoding':='UTF-8'}
So it is still a STRING datatype. Do you have another workaround?

@stlehmann
Owner

This is a hard one. Honestly, I didn't know about this attribute. It will be hard to read values encoded this way because the size of a character is not fixed. Also, we don't know the encoding on the client side.
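
A minimal sketch of the variable-width point, assuming the string buffer is fixed-size and null-terminated. The helper `decode_plc_string` is hypothetical and not part of pyads:

```python
def decode_plc_string(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode a fixed-size, null-terminated string buffer (hypothetical helper)."""
    # Cut at the first null byte; everything after it is treated as padding.
    # This is safe for UTF-8 because multi-byte sequences never contain 0x00.
    payload = raw.split(b"\x00", 1)[0]
    return payload.decode(encoding)


# Bytes per character in UTF-8 vary from 1 to 4.
widths = {ch: len(ch.encode("utf-8")) for ch in ("a", "ä", "€", "😀")}

# An 8-byte buffer holding "äö€" (2 + 2 + 3 bytes) plus one null byte.
raw = "äö€".encode("utf-8").ljust(8, b"\x00")
text = decode_plc_string(raw)
```

The number of characters that fit into a buffer therefore depends on the content, and the reader has to know the encoding in advance to decode it.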

@hkalteBr
Author

I've been experimenting a bit with string encoding in TwinCAT. According to Infosys, the normal STRING uses cp1252 encoding. Shouldn't we integrate that one instead of UTF-8?
I tried it with the write_by_name function (adsSyncWriteReqEx), and now you can write characters from the extended range (codes 128 to 255) and the PLC displays them correctly.
[image: Encoding_cp1252]

It doesn't solve my problem with the UTF-8 encoded strings, but at least you can write and read special characters with normal strings.
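
As a rough illustration of why cp1252 fits a 1-byte-per-character STRING, the sketch below uses only Python's built-in codec; the sample text is made up:

```python
text = "Größe ±5 µm"
raw = text.encode("cp1252")

# Every character that cp1252 can represent occupies exactly one byte.
same_length = len(raw) == len(text)
round_trip = raw.decode("cp1252")

# The euro sign maps to a single byte in cp1252.
euro = "€".encode("cp1252")

# Characters outside cp1252 cannot be encoded at all.
try:
    "Ω".encode("cp1252")
    omega_ok = True
except UnicodeEncodeError:
    omega_ok = False
```

Characters outside the code page still need a different approach, such as WSTRING or the UTF-8 attribute.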

@stlehmann
Owner

Thanks for looking into this. I remember having the encoding issue with symbol comments. I think we already changed the encoding to cp1252 for them, but we still use UTF-8 for the normal read/write operations at the moment. Would you mind sharing the link to the documentation?

@hkalteBr
Author

https://infosys.beckhoff.com/english.php?content=../content/1033/tc3_plc_intro/2529327243.html&id=
I only found it for TwinCAT 3, but I guess it's the same in TwinCAT 2.
Did you have a reason for using UTF-8 for the normal read/write operations?

@stlehmann
Owner

My initial guess is that no one really thought much about the encoding at the time. To address this issue, I suppose we could add a string_encoding parameter to the read/write functions. Alternatively, this could be set as an attribute on the Connection object.
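
One way the two options could fit together, as a rough sketch only: `ConnectionSketch` and `encode_string` are hypothetical names for illustration and do not reflect the actual pyads API.

```python
from typing import Optional


class ConnectionSketch:
    """Hypothetical stand-in for a connection object; not the pyads API."""

    def __init__(self, string_encoding: str = "cp1252") -> None:
        # Connection-wide default used when a call does not override it.
        self.string_encoding = string_encoding

    def encode_string(self, value: str, string_encoding: Optional[str] = None) -> bytes:
        # A per-call argument takes precedence over the connection default.
        encoding = string_encoding or self.string_encoding
        return value.encode(encoding)


conn = ConnectionSketch()
default_bytes = conn.encode_string("Größe")
utf8_bytes = conn.encode_string("Größe", string_encoding="utf-8")
```

With this shape, most callers would only set the encoding once, while variables that use the TcEncoding attribute could still be handled per call.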

@stlehmann stlehmann changed the title from "Fix ADS Error 1797 for Strings with special characters" to "Use cp1252 as default encoding for Strings" on Dec 23, 2021
@stlehmann
Owner

It is good practice to open an issue before opening a PR, so I created #301 on this topic, where we can discuss the issue further.

Owner

@stlehmann stlehmann left a comment

String encoding is an issue that concerns most of the read/write functions, so eventually they all need to be changed. I also feel we need to add some tests for encoding.

@@ -1089,7 +1089,7 @@ def adsSumWrite(
         if data_name in structured_data_names:
             buf[offset: offset + data_symbols[data_name].size] = value
         elif data_symbols[data_name].dataType == ADST_STRING:
-            buf[offset: offset + len(value)] = value.encode("utf-8")
+            buf[offset: offset + len(value.encode("utf-8"))] = value.encode("utf-8")
Owner

This is a good starting point. We should introduce a constant DEFAULT_TCENCODING = "cp1252". We should also store the encoded value in a variable to avoid encoding it more than once.
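
A sketch of what that could look like, assuming the buffer is a `bytearray`. The helper `put_string` is hypothetical, and only the constant name comes from the comment above:

```python
DEFAULT_TCENCODING = "cp1252"


def put_string(buf: bytearray, offset: int, value: str,
               encoding: str = DEFAULT_TCENCODING) -> int:
    """Copy an encoded string into buf at offset (hypothetical helper).

    Returns the number of bytes written.
    """
    encoded = value.encode(encoding)  # encode once and reuse the result
    end = offset + len(encoded)
    if end > len(buf):
        raise ValueError("encoded string does not fit into the buffer")
    # Slice length equals the encoded length, so the buffer size is unchanged.
    buf[offset:end] = encoded
    return len(encoded)


buf = bytearray(16)
written = put_string(buf, 4, "Größe")
```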

2 participants