Select json data from database #1791
""" | ||
selections = selections if selections else ["$"] | ||
if page_size is None: | ||
limit = float("+inf") |
limit = float("+inf") | |
limit = sys.maxsize |
Could you use sys.maxsize instead of converting a string to a float?
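To illustrate the suggestion: a float infinity works in comparisons, but it cannot be used where an integer is required, such as a slice bound. The sketch below assumes the limit ends up in a slice; `page_slice` is a hypothetical helper, not the code in this PR.

```python
import sys


def page_slice(rows, page=0, page_size=None):
    """Return one page of rows; page_size=None means no paging (hypothetical helper)."""
    if page_size is None:
        # An integer "no limit" value is valid as a slice bound.
        return rows[0:sys.maxsize]
    start = page * page_size
    return rows[start:start + page_size]


def float_limit_fails(rows):
    """Show that a float limit is rejected when used as a slice bound."""
    try:
        rows[0:float("+inf")]
    except TypeError:
        return True
    return False
```

With ten rows, `page_slice(rows)` returns all of them, while `float_limit_fails(rows)` reports that the float bound raises `TypeError`.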
""" | ||
Evaluate the where expression. | ||
""" | ||
if op == "=": |
would it be clearer to use match \ case instead of if \ elif?
That's Python 3.10+.
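One alternative that avoids both a long if/elif chain and the 3.10+ requirement is a dispatch table built on the standard `operator` module. This is only a sketch: the function name `evaluate` and the set of operators are assumptions, since only "=" is visible in the diff above.

```python
import operator

# Hypothetical operator table; only "=" is confirmed by the diff under review.
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "in": lambda left, right: left in right,
}


def evaluate(op, left, right):
    """Evaluate a single comparison by looking the operator up in the table."""
    try:
        func = OPERATORS[op]
    except KeyError:
        raise ValueError(f"unsupported operator: {op!r}") from None
    return func(left, right)
```

Adding an operator then means adding one entry to the table rather than another branch.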
:param where: A single where-expression (see below) | ||
:type where: tuple or list | ||
:param sort_by: A list of expressions to sort on | ||
:type where: tuple or list |
:type where: tuple or list | |
:type sort_by: tuple or list |
:param page: The page number to return (zero-based) | ||
:type page: int | ||
:param page_size: The size of a page in rows; None means ignore | ||
:type page: int or None |
:type page: int or None | |
:type page_size: int or None |
@Nick-Hall, before we invest too much time in reviewing and addressing issues, we need a couple of questions answered:
|
If I read the code correctly, the current implementation of Any query would still need transforming from "gramps SQL" into the SQL dialect used by the DB given gramps support of different DBs and their differing syntax, especially for querying JSON. |
Yes, that is correct. I thought that this was a compromise to add just a little bit that can be converted into SQL, but also can be run without it if needed (Pure-Python version). There has been a lot of debate over this issue over the years, and I didn't want to wade into that.
The line I am drawing is that if you need a JOIN, then you should write a "business logic" method.
As long as the selected handles are fewer than the total rows, then it will be faster, but at the expense of some memory. We can actually make a decision in the rule.prepare() to decide to not make the
Yes, another reason to keep this simple: it will be overloaded in MongoDb, etc. |
Would MessagePack be useful?
|
Maybe for some functions, but I don't think anything related to the latest work on Gramps. JSON is useful because it is self-documenting, allows the database to be used outside of Gramps, and can be directly queried by SQL. |
I'm going to close this for now, as I don't want SQL issues getting in the way of moving forward with the filter fixes. |
This PR adds a generic method for querying the database. It has a pure-python implementation (in case some database backend cannot implement it, or for testing, but not designed for real use) and a DB-API implementation. The pure-python version could be removed from this PR.
Motivation
As we create an optimized filter API, we will need to create methods for querying the data using the power of the underlying database (such as SQL, or MongoDB). Some of these methods will require "business logic" methods that require JOINS or other more complicated queries.
However, a large number of queries are simple and can be implemented by simply querying the JSON data in a table.
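As a rough sketch of what "simply querying the JSON data in a table" can look like with SQLite through DB-API: the table name `person`, the column `json_data`, the sample records, and the helper `select_handles_by_gender` are all assumptions made for illustration, not the schema or API in this PR. It also assumes the SQLite build includes the JSON functions.

```python
import json
import sqlite3

# Hypothetical schema: one row per object, with the object serialized as JSON.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (handle TEXT PRIMARY KEY, json_data TEXT)")
people = [
    {"handle": "h1", "gender": 1, "tag_list": ["t1"]},
    {"handle": "h2", "gender": 0, "tag_list": []},
    {"handle": "h3", "gender": 1, "tag_list": ["t2"]},
]
conn.executemany(
    "INSERT INTO person (handle, json_data) VALUES (?, ?)",
    [(p["handle"], json.dumps(p)) for p in people],
)


def select_handles_by_gender(conn, gender):
    """Let the database filter on a JSON field instead of loading every object."""
    cursor = conn.execute(
        "SELECT handle FROM person "
        "WHERE json_extract(json_data, '$.gender') = ? "
        "ORDER BY handle",
        (gender,),
    )
    return [row[0] for row in cursor]
```

The filtering happens inside the database, so only the matching handles cross into Python.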
Example
There are many examples in the 274 filter rules. Here is one that is used often: _hastag.py. This is used to determine if an object contains a tag.
Currently, it is designed to examine every object to see if the tag_handle is in the tag_list.
However, that can be re-written as:
The key line is:
Decision
The choice is:
db.select(), that will prevent having to write dozens of one-line methods
We need to make this decision before touching the 274 rules for the filter refactor/optimize PR.