Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reserved MySQL keywords as field names result in syntax errors on string restriction #1196

Open
CBroz1 opened this issue Dec 2, 2024 · 1 comment

Comments

@CBroz1
Copy link
Contributor

CBroz1 commented Dec 2, 2024

Bug Report

Description

There are no guardrails in place to prevent a user from declaring a field called 'match', for example, which will require escaping in order to perform string restrictions.

Reproducibility

Include:

  • OS Linux
  • Python Version 3.9.18
  • MySQL Version 8.0.34
  • MySQL Deployment Strategy (local-native | local-docker | remote)
  • DataJoint Version 0.14.3
  • Minimum number of steps to reliably reproduce the issue
  • Complete error stack as a result of evaluating the above steps
Reproduction
import datajoint as dj

schema = dj.schema("temp")


@schema
class BadField(dj.Lookup):
    definition = """
    id: int
    ---
    match : bool
    """
    contents = [(1, True), (2, False), (3, True)]


if __name__ == "__main__":
    print(BadField() & "id = 1")
    print(BadField() & "`match` = 1")
    print(BadField() & "match = 1")
Error stack
I [77]: run temp.py
*id    match
+----+ +-------+
1      1
 (Total: 1)

*id    match
+----+ +-------+
1      1
3      1
 (Total: 2)

---------------------------------------------------------------------------
QuerySyntaxError                          Traceback (most recent call last)
File ~/wrk/spyglass/temp-btl.py:19
     17 print(BadField() & "id = 1")
     18 print(BadField() & "`match` = 1")
---> 19 print(BadField() & "match = 1")

File ~/miniconda3/envs/rio/lib/python3.9/site-packages/datajoint/expression.py:671, in QueryExpression.__repr__(self)
    660 def __repr__(self):
    661     """
    662     returns the string representation of a QueryExpression object e.g. ``str(q1)``.
    663
   (...)
    666     :rtype: str
    667     """
    668     return (
    669         super().__repr__()
    670         if config["loglevel"].lower() == "debug"
--> 671         else self.preview()
    672     )

File ~/miniconda3/envs/rio/lib/python3.9/site-packages/datajoint/expression.py:676, in QueryExpression.preview(self, limit, width)
    674 def preview(self, limit=None, width=None):
    675     """:return: a string of preview of the contents of the query."""
--> 676     return preview(self, limit, width)

File ~/miniconda3/envs/rio/lib/python3.9/site-packages/datajoint/preview.py:13, in preview(query_expression, limit, width)
     11 if width is None:
     12     width = config["display.width"]
---> 13 tuples = rel.fetch(limit=limit + 1, format="array")
     14 has_more = len(tuples) > limit
     15 tuples = tuples[:limit]

File ~/miniconda3/envs/rio/lib/python3.9/site-packages/datajoint/fetch.py:236, in Fetch.__call__(self, offset, limit, order_by, format, as_dict, squeeze, download_path, *attrs)
    234         ret = return_values[0] if len(attrs) == 1 else return_values
    235 else:  # fetch all attributes as a numpy.record_array or pandas.DataFrame
--> 236     cur = self._expression.cursor(as_dict=as_dict)
    237     heading = self._expression.heading
    238     if as_dict:

File ~/miniconda3/envs/rio/lib/python3.9/site-packages/datajoint/expression.py:658, in QueryExpression.cursor(self, as_dict)
    656 sql = self.make_sql()
    657 logger.debug(sql)
--> 658 return self.connection.query(sql, as_dict=as_dict)

File ~/miniconda3/envs/rio/lib/python3.9/site-packages/datajoint/connection.py:343, in Connection.query(self, query, args, as_dict, suppress_warnings, reconnect)
    341 cursor = self._conn.cursor(cursor=cursor_class)
    342 try:
--> 343     self._execute_query(cursor, query, args, suppress_warnings)
    344 except errors.LostConnectionError:
    345     if not reconnect:

File ~/miniconda3/envs/rio/lib/python3.9/site-packages/datajoint/connection.py:299, in Connection._execute_query(cursor, query, args, suppress_warnings)
    297         cursor.execute(query, args)
    298 except client.err.Error as err:
--> 299     raise translate_query_error(err, query)

QuerySyntaxError: ("You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '= 1)) ORDER BY `id` LIMIT 13' at line 1", 'SELECT `id`,`match` FROM `cbroz_temp`.`#bad_field` WHERE ( (match = 1)) ORDER BY `id` LIMIT 13')

Expected Behavior

A naive user might not be familiar with SQL keywords and expect the following to behave similarly.

MyTable & 'my_field = 1' # works, no reserved keyword
MyTable & {'match':1} # works, escaped keyword
MyTable & 'match = 1'

DataJoint could potentially...

  1. prevent field names like this, and raise an error similar to using underscores in table names
  2. escape field names in string restrictions
  3. on SyntaxError, scan the where clause for reserved keywords and suggest the user restrict with a dictionary instead.

Screenshots

n/a

Additional Research and Context

n/a

@CBroz1 CBroz1 added the bug label Dec 2, 2024
@CBroz1 CBroz1 changed the title Reserved MySQL keywords as field names result in syntax errors on restriction Reserved MySQL keywords as field names result in syntax errors on string restriction Dec 2, 2024
@dimitri-yatsenko
Copy link
Member

This is a known problem that has been raised from time to time. DataJoint does not prevent users from using attribute names that are reserved words in MySQL, e.g. key, select, signal. Our decision was to address this correctly in all cases where datajoint has control over the syntax (e.g. declarations and queries), but not in cases where the user supplies their own SQL conditions as strings. This should be addressed in the documentation or perhaps in the error message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants