Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: return default with get #748

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

rgraber
Copy link

@rgraber rgraber commented Jan 27, 2025

Why is this the best possible solution? Were any other approaches considered?

Mapping classes have a default get method that accepts a default and returns it if the key requested is not found. Though SurveyElement has been updated to rely more on attributes than keys, it breaks the Mapping contract to have get not work as expected with a default (it will currently return an AttributeError if the attribute is not found).

The alternative would be to update getitem to turn an AttributeError into a KeyError so it could be properly caught by the default get method. However, element[key] may still be called in several places and I do not want to risk undermining any assumptions about how it currently works.

What are the regression risks?

There shouldn't be any, this is a return to expected behavior.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

Before submitting this PR, please make sure you have:

  • included test cases for core behavior and edge cases in tests
  • run python -m unittest and verified all tests pass
  • run ruff format pyxform tests and ruff check pyxform tests to lint code
  • verified that any code or assets from external sources are properly credited in comments

@lindsay-stevens
Copy link
Contributor

The dict structure and behaviour of SurveyElement has, and continues to be, a major source of performance and correctness problems. To address this, recently I have been working towards making SurveyElement not a dict, and the change to Mapping (which does not require a get implementation) is part of that transition to preserve some dict-like behaviour while remaining dict-like usages are gradually converted to class usages. The types and attributes of SurveyElement, all of it's subclasses, and to some extent it's dict attributes (like bind), are largely set by the XLSForm and/or XForm spec and/or haven't changed in 5-10+ years, so will eventually be regular classes with set attributes.

If you want to use continue to use SurveyElement externally, I would suggest updating your code to switch from assuming dict access to using attribute access, either:

  • find-and-replace to swap obj.get(key, default) to getattr(obj, key, default)
  • checking the type with isinstance before accessing a potentially non-existent attribute
  • try/catch for AttributeError if the attribute doesn't exist
  • create your own subclasses with custom get

As a side note it would be helpful in future if you please add some context e.g. what is your calling code and use case that requires this change, ideally in the form of an ODK forum discussion. I have searched GitHub a few times for libraries using pyxform as a library, but not found much. There's some references in ona / kpi repos but it seemed also like key parts of pyxform had been copied / vendorised anyway. I consider the xls2xform conversion functions to be the only active external interfaces, and anything else is internals which should not be relied on by external packages (and ideally shouldn't be exposed in the package __init__).

@rgraber
Copy link
Author

rgraber commented Jan 28, 2025

To address this, recently I have been working towards making SurveyElement not a dict, and the change to Mapping (which does not require a get implementation)

According to https://docs.python.org/3/library/collections.abc.html#collections-abstract-base-classes Mapping should have a get implementation. It gets it from the mixin, so it doesn't need to reimplement it, but I would argue that the contract should still work.

As a side note it would be helpful in future if you please add some context e.g. what is your calling code and use case that requires this change, ideally in the form of an ODK forum discussion.

Would you prefer to continue this discussion there? I thought this would be a bug fix so I just made a straight up PR but since there's more to it I can go to the forum.

@lognaturel
Copy link
Contributor

lognaturel commented Jan 31, 2025

Hi and welcome, @rgraber! @lindsay-stevens has been the primary maintainer of pyxform for some years now and I offer support in various ways. We're both from the ODK team.

I just realized I was supposed to add a note about breaking changes in the release notes for v3.0.0 but did not. I've added it now: https://github.com/XLSForm/pyxform/releases/tag/v3.0.0

Since this issue is developer-facing, let's keep the conversation going here and we can pull in others or increase visibility if needed. 😊 I imagine @ukanga has run into this too and will be interested in following along. I know these breaking changes may appear frivolous but we promise they have user-facing value! As @lindsay-stevens mentioned, form parsing is now much, much faster across the board with some forms that took minutes to go from XLSForm to XForm now taking seconds.

I think neither of us considered that SurveyElement would still implicitly have a get implementation.

At a high level, in what context do you get AttributeErrors now? Like for example maybe you try to access attributes only applicable to a specific SurveyElement subtype on a SurveyElement that may not be that subtype and you need to do this because XYZ.

Contract violation aside, would it be onerous for you to switch to one of the usages @lindsay-stevens has suggested above? I'm asking because as he hinted at, we're contemplating a deeper move away from dict-like access and it'd be helpful to get a sense of impact for you. That will also be info we consider for how to handle the issue in the immediate term -- fix get as you've suggested or move closer to the intent of not having a get.

Are you affected by other breaking changes from 3.0.0 or is this the only one?

@lindsay-stevens and I will work together to get this resolved in the next week. Thanks!

@rgraber
Copy link
Author

rgraber commented Feb 3, 2025

Hello @lognaturel :)
We got the error when trying to access the flat attribute. We had called get_abbreviated_xpath in several places in our code and, when it was taken out, needed to recreate it within our own repository. That includes a call to remove anything that is flat from the path.

We did manage to get around the problem by using hasattr, though we know that's not the desired way. We were prepping for a release though so chose the most expedient thing we could think of.

@rgraber
Copy link
Author

rgraber commented Feb 3, 2025

Here is the PR we used to fix all the issues that came with the upgrade: kobotoolbox/kpi#5442 . It unfortunately has a lot of formatting changes too because of how our formatting process works, but generally the important part is the implementation of get_abbreviated_xpath and a few places where we replaced x['type'] with isInstance(x, Question) and x.type. We left in one or two hasattrs because we didn't know exactly which classes had the attributes we cared about.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants