Breaking change of dataclasses.dataclass comparison semantics in 3.13+ #128294
In 3.12 we used to do something like: Lines 1085 to 1094 in 3a726be
Now, instead we do: Problems:
|
Indeed, we are caught between two stools here. From my perspective, the right decision would be to revert as soon as possible, before major Linux distros adopt 3.13 widely. In advance, I can give some context on the domain where the issue arises first. It is machine learning, where data classes are used a lot for neural network representation and for maintaining weight collections. Comparison is broken there because overridden |
Considering that we largely short-circuit equality in general, I think we should revert it. In addition, we are now out-of-sync with "This method compares the class as if it were a tuple of its fields, in order" (emphasis mine). While this can be a breaking change, we haven't updated the documentation, so maybe it hasn't been observed. Most of the time, two objects that are identical (in terms of pointers) should also compare equal, independently of whether there is a custom cc @Yhg1s as the 3.13 RM. |
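The short-circuit mentioned above can be observed directly with built-in containers. This is a minimal sketch using only `float('nan')`, a value that never compares equal to itself; the variable names are chosen for illustration.

```python
nan = float("nan")

# Compared directly, NaN is never equal to itself.
direct = nan == nan

# Inside a tuple or list, elements that are the same object are treated
# as equal without calling their __eq__ (identity is checked first).
same_object_in_tuple = (nan,) == (nan,)
same_object_in_list = [nan] == [nan]

# Two distinct NaN objects get no shortcut, so the containers differ.
distinct_objects = (float("nan"),) == (float("nan"),)

print(direct)                # False
print(same_object_in_tuple)  # True
print(same_object_in_list)   # True
print(distinct_objects)      # False
```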
I will send a fix ASAP. |
See also #120645. I don't think that we could back out the change at this point. |
From the perspective of PEP 557, data classes should be compared as tuples, since it explicitly states that Data Classes can be thought of as “mutable namedtuples with defaults”. I guess that making data classes as close as possible to named tuples was a primary design goal. For this reason, the breaking change introduced by #120645 was not the wisest decision.

from collections import namedtuple
from dataclasses import dataclass

# Named tuple: the identical NaN object in both instances compares equal.
Point = namedtuple('Point', ['x', 'y'])
p1 = Point(0.0, float('nan'))
p2 = Point(0.0, p1.y)
assert p1 == p2  # OK

# Data class with the same fields (rebinds the name Point).
@dataclass
class Point:
    x: float
    y: float

p1 = Point(0.0, float('nan'))
p2 = Point(0.0, p1.y)
assert p1 == p2  # OK (3.12); FAIL (3.13).
|
On the other hand, we rarely write Perhaps in the future we will add a per-field boolean option for using an identity check. |
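No such per-field option exists in `dataclasses` today. As a rough sketch of what an opt-in identity check could look like in user code, the example below uses a hypothetical decorator `identity_aware_eq` and a hypothetical metadata key `"identity_ok"`; both names are assumptions made for illustration, not part of any library.

```python
from dataclasses import dataclass, field, fields


def identity_aware_eq(cls):
    """Hypothetical helper: install an __eq__ that accepts identical
    objects as equal for fields tagged with metadata {"identity_ok": True}."""

    def __eq__(self, other):
        if other.__class__ is not self.__class__:
            return NotImplemented
        for f in fields(self):
            if not f.compare:
                continue  # honour compare=False fields
            a, b = getattr(self, f.name), getattr(other, f.name)
            if f.metadata.get("identity_ok") and a is b:
                continue  # opt-in identity shortcut for this field
            if not a == b:
                return False
        return True

    cls.__eq__ = __eq__
    return cls


@identity_aware_eq
@dataclass
class Point:
    x: float                                          # plain == comparison
    y: float = field(metadata={"identity_ok": True})  # identity allowed


nan = float("nan")
print(Point(0.0, nan) == Point(0.0, nan))  # True: y is the same object
print(Point(nan, 0.0) == Point(nan, 0.0))  # False: x is compared with ==
```

Keeping the shortcut opt-in per field leaves the 3.13 field-by-field behaviour as the default while letting individual fields recover the tuple-like treatment of identical objects.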
I think that in this case (no revert) we need to update the docs to reflect the behaviour change. The OP has several examples of older wordings like:
- *eq*: If true (the default), an :meth:`~object.__eq__` method will be
  generated. This method compares the class as if it were a tuple
  of its fields, in order. |
Indeed, dataclasses usually are not sequences, but they are fancy sequences. At least, that was a design goal according to PEP 557. Anyway, I'd like to hear @ericvsmith's opinion on this matter, since he initially drafted data classes. Another consequence of this breaking change is compatibility with third-party data class libraries (e.g. general-purpose
From my perspective, new semantic of |
I think it's unfortunate that there wasn't a warning of this change. But at this point it has been released, and rolling it back is problematic. Maybe a better notice in the release notes would be enough? I don't think the PEP comment that dataclasses can be thought of as “mutable namedtuples with defaults” implies that they should be implemented with tuple semantics where possible. The statement is a very high-level overview of the functionality. @rhettinger: what are your thoughts? |
Bug report
Bug description:
Brief Description
The optimization done in #109870 changed the semantics of dataclass comparison.
Description
The pseudo code below shows the meaning of the changes made in #109870. The semantics differ largely because of a shortcut in the __eq__ implementation of sequence-like containers (see Objects/object.c). The shortcut essentially does self[i] is other[i]. Consequently, the __eq__ method of self[i] is not evaluated for identical objects in 3.12 during dataclasses.dataclass comparison.
According to the Python docs (citation below), v3.13 introduces a breaking change, since it no longer treats the fields as a tuple for dataclass comparison.
Test Case
CPython versions tested on:
3.13 (3.12)
Operating systems tested on:
Linux