-
-
Notifications
You must be signed in to change notification settings - Fork 33.1k
Refs #32519 -- Added JSONSet and JSONRemove functions. #18758
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Refs #32519 -- Added JSONSet and JSONRemove functions. #18758
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the distinct MR @adzshaf.
django/db/models/functions/json.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should pass this as a parameterized text[]
otherwise this is a SQL injection vector
Try
SELECT '{foo",bar}'::text[];
key_paths = key.split(LOOKUP_SEP) | |
key_paths_join = ",".join(key_paths) | |
new_source_expressions.append(Value(f"{{{key_paths_join}}}")) | |
key_paths = key.split(LOOKUP_SEP) | |
new_source_expressions.append(Value(key_paths)) |
If it doesn't work by itself you might need to pass an explicit output_field=ArrayField(TextField())
.
We also likely want to add tests for using JSONSet
with keys containing commas and quotes
JSONSet("data", **{"key',\"}": "foo"})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. Done in 85d3a44.
django/db/models/functions/json.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to make it a Value
here instead of in each implementations?
else value | |
else Value(value, output_field=self.output_field) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to do this in 75dd7b5.
However, for some reason, Value(None)
is interpreted as SQL NULL on PostgreSQL, so I had to wrap it in Value
again. Before this change, I am pretty sure it was only Value(None)
and not Value(Value(None))
. Do you have any idea why? @charettes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strange, Value(Value(...))
should not even work as only non-expression are allowed to be passed to Value
.
I think you'll have to special case value is None
until there's a way to explicitly differentiate between both.
I guess you could give a shot at defining
class JSONNull(BaseExpression):
output_field = JSONField()
def as_sql(self, connection, compiler):
return "%s", [connection.ops.adapt_json_value(None)]
And then using it like
JSONNull() if value is None else Value(value, output_field=self.output_field)
django/db/models/functions/json.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto this is a SQL injection vector as keys are not sanitized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure how this can be a SQL injection vector. The compile_json_path()
function uses json.dumps()
for each key in the path, so it will escape quotes and backslashes properly. Then, we use Value
which will separate the expression into the SQL template and parameters in its as_sql()
.
I added some tests in 08f07dc. Although, I just noticed it seems SQLite cannot handle double quotes in the path. See also https://stackoverflow.com/questions/67993982. I think we can address this with a feature flag.
However, if there is a better way to do it, please let me know. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right about the key JSON quoting in this case, I thought it was the same thing as in the as_postgres
method.
The issue with double quotes on SQLite and likely with single quotes on Oracle in generalized in ticket-32213 and ticket-35842.
docs/topics/db/queries.txt
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we point out why JSONSet
is preferable here to avoid similar problems as the one described in why F
objects should be used?
The current docs point out that it can be used as an alternative but not why; mainly to avoid race conditions that would overwrite the full settings
instead of specific keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is a good idea! I'll get to it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 6bbc16f3c1668fd3d571816b7ee7774960d50be8.
08f07dc
to
fe9c715
Compare
36f5018
to
6be9101
Compare
34f8dba
to
f6ae03d
Compare
The PR is ready for another review. Please feel free to share your comments or suggestions 😊 @laymonage @charettes @sarahboyce Tagging @felixxm in case you are interested and have some time to take a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @adzshaf ⭐ added a couple of small docs comments before I go on vacation
Hope to review this more thoroughly in the new year 🎄
0854e45
to
c6b60da
Compare
Thank you for the suggestions! I already fixed the comments on docs. Feel free to share more comments! @sarahboyce |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some code suggestions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elif output_type == "JSONField": | |
if output_type == "JSONField": |
Superfluous elif
The
elif
statement is not needed, as thereturn
statement will always break out of the enclosing function. Removing theelif
will reduce nesting and make the code more readable.
django/db/models/functions/json.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
key, value = list(self.fields.items())[0] | |
key, value = next(iter(self.fields.items())) |
Calling list(...)
will create a new list of the entire collection ... If you only need the first element of the collection, you can use next(iter(...)
to lazily fetch the first element.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elif connection.vendor in ["postgresql", "mysql"]: | |
elif connection.vendor in {"postgresql", "mysql"}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"theme": {"type": "dark", "opacity": decimal.Decimal(100.0)}, | |
"theme": {"type": "dark", "opacity": decimal.Decimal("100.0")}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
theme__opacity=decimal.Decimal(50.0), | |
theme__opacity=decimal.Decimal("50.0"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"theme": {"type": "dark", "opacity": decimal.Decimal(100.0)}, | |
"theme": {"type": "dark", "opacity": decimal.Decimal("100.0")}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
theme__opacity=decimal.Decimal(50.0), | |
theme__opacity=decimal.Decimal("50.0"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have yet to review the docs and tests, but the implementation seems close! Not sure about some of the feature flags, though. I think having supports_partial_json_update
is fair (given that we need it for Oracle). The others may be a bit too specific, but it also helps us in testing.
django/db/models/functions/json.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's probably done for consistency with the other DB vendors, but I think in this case we can make a small optimization by skipping copy()
for the base case:
copy = self.copy() | |
all_items = self.paths | |
path, *rest = all_items | |
if rest: | |
copy.paths = (path,) | |
return JSONRemove(copy, *rest).as_oracle( | |
compiler, connection, **extra_context | |
) | |
return super(JSONRemove, copy).as_sql( | |
path, *rest = self.paths | |
if rest: | |
copy = self.copy() | |
copy.paths = (path,) | |
return JSONRemove(copy, *rest).as_oracle( | |
compiler, connection, **extra_context | |
) | |
return super().as_sql( |
django/db/models/functions/json.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem necessary
if isinstance(value, Value): | |
# We do not need Cast() because we use the FORMAT JSON clause instead. | |
value = Value(value, output_field=self.output_field) | |
new_source_expressions.append(value) | |
new_source_expressions.append(value) |
django/db/models/functions/json.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy = self.copy() | |
all_items = list(self.fields.items()) | |
key, value = all_items[0] | |
rest = all_items[1:] | |
copy = self.copy() | |
(key, value), *rest = self.fields.items() |
django/db/models/functions/json.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy = self.copy() | |
all_items = list(self.fields.items()) | |
key, value = all_items[0] | |
rest = all_items[1:] | |
copy = self.copy() | |
(key, value), *rest = self.fields.items() |
django/db/models/functions/json.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably be a bit stricter in the checks to ensure we don't make assumptions for Value
s that don't have a JSONField
output_field
.
if isinstance(value, Value): | |
if isinstance(value, Value) and isinstance(value.output_field, JSONField): |
django/db/models/functions/json.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, so that we don't do Cast()
if the user does not want to
# If it's a Value, assume it to be a JSON-formatted string. | |
# Use Cast to ensure the string is treated as JSON on the database. | |
if isinstance(value, Value): | |
# Use Cast to ensure the string is treated as JSON on the database. | |
if isinstance(value, Value) and isinstance(value.output_field, JSONField): |
django/db/models/functions/json.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return {**super().get_repr_options(), **self.fields} | |
return {**super()._get_repr_options(), **self.fields} |
django/db/models/functions/json.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work, because we use positional arguments for the paths, not kwargs.
We had to resolve the paths separately instead of passing it as *expressions
to Func
(because some DBs need to use recursive calls instead of arbitrary number of args). Unfortunately there's no equivalent of _get_repr_options
for the positional args, so we have to override __repr__
if we really want the nice representation, e.g.
def _get_repr_options(self): | |
return {**super().get_repr_options(), **self.fields} | |
def __repr__(self): | |
args = self.arg_joiner.join(str(arg) for arg in self.source_expressions) | |
paths = self.arg_joiner.join(str(path) for path in self.paths) | |
return f"{self.__class__.__name__}({args}, {paths})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need supports_partial_json_update
to depend on supports_json_field
on SQLite.
c6b60da
to
179cde5
Compare
This allows us to mark serialized JSON data as JSON instead of a regular string when passing it to the database.
179cde5
to
00ea5b9
Compare
HI @adzshaf, thanks for being so diligent responding to feedback. I rebased just to see how CI looked. Do you have time to push this forward? This would now target 6.1. |
Trac ticket number
ticket-32519, supersedes #18489
Branch description
This PR introduces the
JSONSet
andJSONRemove
functions to make partial updates toJSONField
s directly on the database. Consider following example to update aJSONField
usingJSONSet
.You can also remove a key by using
JSONRemove
.Checklist
main
branch.