Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Article for how to deal with IDL field name collisions. #172

Open
wants to merge 5 commits into
base: gh-pages
Choose a base branch
from

Conversation

tfoote
Copy link
Contributor

@tfoote tfoote commented Apr 30, 2018

We've identified that we need a standard way to deal with reserved keywords in different languages. We cannot ban the union of keywords as we don't know them for all potential future languages.

This is the high level of what I collected from our discussion of how to deal with field name. There's still several TODOs and things that could be fleshed out with more details. However please review it for the high level logic and if there's any other significant topic that should be covered here as well.

@dirk-thomas
Copy link
Member

I am not convinced this needs a separate article. I would prefer adding this it the existing articles instead. Either to the hogh level rosidl one as a new section or to the articles describing the language specific mappings.

In the case that a new language is added and there is a detected existing symbol, the language specific generator is expected to implement a basic deconfliction mangling of the symbol to avoid the conflict.

Each generator will define the mangling procedure and provide a list of symbols for which the procedure will be applied.
This list should be all known keywords or otherwise conflicting symbols.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the parser will not have knowledge of the generator's specific mangling, it may be worth mentioning that each generator will need to ensure that each field is mangled (if needed) and to check that the resulting fields don't collide wit each other (e.g. someone defining a field class and a field class_ in the same msg)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added more specific language to say that this sort of accidental collision should be made not possible.

@tfoote
Copy link
Contributor Author

tfoote commented Apr 30, 2018

Yeah, it could be integrated into the IDL article. Please comment here and after getting the feedback here. I'll refactor it into the rosidl article.


To avoid conflicts generally all constants should be prefixed with a `k`. This will avoid almost all conflicts.

Google Style Guide: https://google.github.io/styleguide/cppguide.html#Constant_Names
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this kind of section is too language specific. Instead I would phrase it that each language generator has to apply "some" mangling to avoid collisions with keywords as well as other symbols (like platform specific defines).

And in general the generated code will follow the style guide for that specific language. So I wouldn't tangle it with that topic here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want the constants to be all upper case in the IDL file and then the c++ generator renames FOO to kFOO? I kind of feel like that's going to end up with a lot of divergance and decrease readability if the field names differ between implementations. foo.kCONSTANT would end up as foo.CONSTANT in python.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see this comment when I added my comment below in the second round of review.

The point that the k prefix is a C++ specific decision. What if another language comes by where it is recommended to use j as the prefix? Why do we favor one over the other? Therefore I think the IDL should be agnostic to that and any mangling should be handled in a language specific way even if that means that the constant might have different names in different languages.


### Stability
Each language generator will provide a standard name deconfliction policy so that users can use the symbols in a stable manner.
If a change is needed the generated code should provide a full cycle of backwards compatibility to support changes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this needs to be part of the article. Also in case we discover a collision not known before I don't think there needs / can't be a tick-tock cycle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed


* `NO_ERROR`
* `DELETE`
* `ERROR`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not convinced the extra effort for supporting both for a period of time is necessary. Imo we could just apply the mangling approach described above and let users update their code base. I think doing that rather soon and then keeping this backward compatibility stuff (which requires extra effort to be implemented) is preferred.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to have some level of consistency. I've changed it to say that we'll manually port back duplicate mangled constants in Melodic so that a codebase can span both ROS1 and ROS2. Then we won't effect ROS2's implementation.

@mikaelarguedas mikaelarguedas added in review Waiting for review (Kanban column) and removed in progress Actively being worked on (Kanban column) labels May 3, 2018

#### Naming Constants

Constant names must be uppercase alphanumeric characters with underscores for separating words except the first character which should be a lowercase k.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this referring to the naming in the IDL or to the generated C++ code? If it is the former I would argue that we shouldn't put this constrain into the IDL itself. In some languages ERROR is just fine. Therefore I would leave such a convention to the language specific code generator.


Clearly this would collide if someone defined `class` and `class_`, so slightly more unique string is recommended for clarity and uniqueness.
The mangling should be designed so that accidental collisions between names and mangled names cannot happen.
One easy way to do this would be to a language specific valid character not valid in the standard field name specification.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A common approach - which doesn't rely on an extra character - is to apply the mangling in in both cases. So if a user defined FOO and FOO_ both are getting mangled to FOO_ and FOO__.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't applying the name mangling in both cases would mean that users would always have to access fields by mangled names? e.g. in C++, msg->class_, as needed, but also msg->data_, which is undesired (but at least it's consistent). I know that originally the mangling topic was brought up with a focus on enumerations (where perhaps this comment doesn't apply), but this document seems to be about fields in general.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just pointing out that the described key-value mapping in the current patch isn't sufficient. Not specific to fields or constants.

FOO: FOO_
new: new_
delete: delete_
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the comment above the table will not define the "only" keys which need to be mangled. "Derived" variations of the key (e.g. the mangle name) need to be considered and mangle too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the mangling strategy is producing symbols that are not allowed in the msg spec (like adding a trailing underscore as pointed out in https://github.com/ros2/design/pull/172/files#r194139732) we could get away without having to consider the mangled names as we would not have conflicts between use defined fields and mangles names by design


Clearly this would collide if someone defined `class` and `class_`, so slightly more unique string is recommended for clarity and uniqueness.
The mangling should be designed so that accidental collisions between names and mangled names cannot happen.
One easy way to do this would be to a language specific valid character not valid in the standard field name specification.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current state of ros2 idl restrictions prevents fields from being defined with a trailing underscore. If we aren't planning to relax that (I think it wasn't imposed in ros1), then perhaps a trailing underscore can always used for mangling, without any potential for accidental collisions?

If you are using the legacy defines it's required to use the namespaced reference instead of using a shorthand version as those are know to conflict.

For an easy migration path.
The mangled version of the constants will be made available in ROS Melodic by manually defining them in parallel so that code can be ported and use the same constant values while the codebase is spanning ROS1 and ROS2 implementations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this work that has been started already or something we will do at some point during Melodic lifetime ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in progress Actively being worked on (Kanban column)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants