Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add performance support for JSON_MATCH_ANY on S2 8.0+ #58

Open
harisenbon opened this issue Feb 26, 2023 · 3 comments
Open

Add performance support for JSON_MATCH_ANY on S2 8.0+ #58

harisenbon opened this issue Feb 26, 2023 · 3 comments

Comments

@harisenbon
Copy link

Hey all,

I'm working with the S2 team, and we've run into huge performance issues when querying on multiple JSON fields in S2 (see Support ticket #27009)

Example:

WHERE
JSON_EXTRACT_STRING(`value`, 'lccontent') = 'paid' OR
JSON_EXTRACT_STRING(`value`, 'lccontent') = 'cpm' OR
JSON_EXTRACT_STRING(`value`, 'lccontent') = 'cps' OR
JSON_EXTRACT_STRING(`value`, 'lccontent') = 'cpc'

With the JSON improvements in version 8, we can replace that with something like this, which runs at 1.25s without load:

JSON_MATCH_ANY(MATCH_PARAM_STRING_STRICT() LIKE '%paid%', `value`, 'lccontent') OR
JSON_MATCH_ANY(MATCH_PARAM_STRING_STRICT() LIKE '%cpm%', `value`, 'lccontent') OR
JSON_MATCH_ANY(MATCH_PARAM_STRING_STRICT() LIKE '%cps%', `value`, 'lccontent') OR
JSON_MATCH_ANY(MATCH_PARAM_STRING_STRICT() LIKE '%cpc%', `value`, 'lccontent')

However, the Laravel driver doesn't currently support JSON_MATCH_ANY out of the box.

I've gotten it working on my system, using the following changes to SingleStore\Laravel\Query\Grammar

Functions to determine if we should use this new code:

/**
 * Get the SingleStore engine we're running on
 *
 * @param Builder $query
 * @return string
 */
private function getVersion(Builder $query)
{
    return Str::after($query->getConnection()->getPdo()->getAttribute(PDO::ATTR_CLIENT_VERSION), ' ');
}

/**
 * Whether we should do the JSON override or not (isJsonSelector & v > 8)
 * 8.0+ has a number of improved JSON functions to speed up queries
 *
 * @param Builder $query
 * @param         $where
 * @return bool
 */
private function doJsonOverride(Builder $query, $where)
{
    return $this->isJsonSelector($where['column']) &&
        version_compare($this->getVersion($query), '8.0') >= 0;
}

We then create a function to wrap the default query with JSON_MATCH_ANY, swapping in MATCH_PARAM_* for the colum:

/**
 * Wrap the returned data in a JSON_MATCH_ANY query for S2 performance
 *
 * @param Builder $query
 * @param         $where
 * @param         $originalCallable
 * @return string
 */
protected function wrapJsonMatchAny(Builder $query, $where, $originalCallable)
{
    // Swap out our column, and call the base query
    $column = $where['column'];
    $where['column'] = DB::raw('SINGLESTORE_JSON_MATCH');
    $whereSQL = call_user_func($originalCallable, $query, $where);

    // Set up our JSON Query
    // -------------------------------------------

    // Break apart the column name from the JSON keypath.
    [$field, $path] = $this->wrapJsonFieldAndPath($column);

    // TODO: Get the matching type from the value
    $matchType = 'MATCH_PARAM_STRING_STRICT()';

    // Re-assemble and return
    // -------------------------------------------
    $whereSQL = str_replace('SINGLESTORE_JSON_MATCH', $matchType, $whereSQL);
    return "JSON_MATCH_ANY($whereSQL, {$field}{$path})";
}

Finally, we apply the override to each where* function that we want support for:

protected function whereBasic(Builder $query, $where)
{
    return $this->doJsonOverride($query, $where) ?
        $this->wrapJsonMatchAny($query, $where, 'parent::whereBasic') :
        parent::whereBasic($query, $where);
}

Let me know if a PR would be preferred.

@AdalbertMemSQL
Copy link
Collaborator

Hi @harisenbon
Thanks for your request.
As for me, it looks a little bit weird that JSON_EXTRACT_STRING(value, 'lccontent') = 'paid' works much slower than JSON_MATCH_ANY(MATCH_PARAM_STRING_STRICT() LIKE '%paid%', value, 'lccontent').
I would do some investigations before adding this feature.

@harisenbon
Copy link
Author

@AdalbertMemSQL

As for me, it looks a little bit weird that it works much slower

Yeah, I thought so too ;)

I would love to use , JSON_EXTRACT_STRING (as it gives some type casting benefits) but I have a month of tickets with support about how I should be using JSON_MATCH_ANY instead.

@AdalbertMemSQL
Copy link
Collaborator

I just had a chance to test this thing.
Used 8.0.12 version of the SingleStore.

MySQL [db]> select * from t where JSON_EXTRACT_STRING(`value`, 'lccontent') = 'asdas';
Empty set (0.163 sec)

MySQL [db]> select * from t where JSON_MATCH_ANY(MATCH_PARAM_STRING_STRICT() LIKE 'asdas', `value`, 'lccontent');
Empty set (0.479 sec)

Looks like JSON_EXTRACT_STRING worked ~3 times faster.
Then I tried bigger JSON values and the difference was not so significant, but even with very big JSON values, JSON_EXTRACT_STRING is slightly faster.

MySQL [db]> select * from t where JSON_EXTRACT_STRING(`value`, 'lccontent') = 'asdas';
Empty set (1.525 sec)
MySQL [db]> select * from t where JSON_MATCH_ANY(MATCH_PARAM_STRING_STRICT() LIKE 'asdas', `value`, 'lccontent');
Empty set (1.873 sec)

@harisenbon Can you please clarify, what version of SingleStore are you using and what the schema/size of your JSON data is?

Probably, this performance degradation happens only in some specific cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants