Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V3 casl 561 query tags #370

Open
wants to merge 5 commits into
base: v3
Choose a base branch
from
Open

V3 casl 561 query tags #370

wants to merge 5 commits into from

Conversation

Amaneusz
Copy link
Collaborator

No description provided.

Signed-off-by: Jakub Amanowicz <[email protected]>
Signed-off-by: Jakub Amanowicz <[email protected]>
Signed-off-by: Jakub Amanowicz <[email protected]>
@@ -160,6 +161,80 @@ class WhereClauseBuilder(private val request: ReadFeatures) {
}
}

private fun whereTags() {
request.query.tags?.let { tagQuery ->
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why .let ? Maybe I'm old-fashioned, but it looks a little weird, as I understand it replaces if(tags != null), but I had to read it twice to understand :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are totally right, this is not clear - will change

Signed-off-by: Jakub Amanowicz <[email protected]>

@JsName("of")
constructor(tag: String, key: String, value: Any?): this() {
constructor(tag: String, key: String, value: Any?) : this() {
this.tag = tag
this.key = key
this.value = value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add some tests for the value, to ensure that NakshaException with NakshaError.ILLEGAL_ARGUMENT is thrown, if the value is invalid. We can be kind here, and automatically convert all numbers, except for Long and Int64 to Double (we can even allow all 64-bit values that are less than 53-bit long).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added similar test in TagMapTest

*
* By default, (if no special prefix is found) tag is normalized with NFD, lowercased, cleaned of non-ASCII and splittable.
*/
object TagNormalizer {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not use object, I removed it everywhere. If this should be a singleton, use the following syntax:

@JsExport
class TagNormalizer private constructor() {
  companion object TagNormalizer_C {
    @JvmStatic
    @JsStatic
    fun foo() {}
  }
}

The reason are plenty of bugs in the JavaScript compiler/linker combination. Sadly, the compiler (for objects) automatically names the companion object, and this naming does not work very well, when used across different modules, because somehow the linker messes up naming. We need to open a ticket with Jetbrains about it, but I would first want to have a working code, then to show the issues in the working code, so they can simply checkout, compile and debug.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

"#" to TagProcessingPolicy(NFD, removeNonAscii = true, lowercase = false, split = true)
)

private val PRINTABLE_ASCII_CODES = 32..128
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 128 is no printable character, so this should be 32..127, see: https://www.ascii-code.com/

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True - this comment will not be valid anymore as I switched to your earlier solution

val policy = policyFor(tag)
var normalized = Platform.normalize(tag, policy.normalizerForm)
normalized = if (policy.lowercase) normalized.lowercase() else normalized
normalized = if (policy.removeNonAscii) removeNonAscii(normalized) else normalized
Copy link
Member

@xeus2001 xeus2001 Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand why this was made slower?

private var AS_IS: CharArray = CharArray(128 - 32) { (it + 32).toChar() }
private var TO_LOWER: CharArray = CharArray(128 - 32) { (it + 32).toChar().lowercaseChar() }

normalized = if (policy.lowercase) {
  if (policy.removeNonAscii)
    removeNonAscii(normalized, TO_LOWER)
  else
    normalized.lowercase()
} else if (policy.removeNonAscii) removeNonAscii(normalized, AS_IS) else normalized

private fun removeNonAscii(text: String, MAP: CharArray)
  val sb = StringBuilder()            
  for (element in normalized) {
    // Note: This saves one branch, and the array-size check, because 0 - 32 will become 65504.
    val c = (element.code - 32).toChar()
    if (c.code < MAP.size) {
        sb.append(MAP[c.code])
    }
  }
  return sb.toString()
}

P.S.: This code only loops the string ones, not twice, when lowercase and non-ascii is requested, and it does not need two copies of the string.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not see that it got slower, I was more focused on readability - nice remark, I will adjust it

.firstOrNull { (prefix, _) -> tag.startsWith(prefix, ignoreCase = true) }
?.value
?: DEFAULT_POLICY
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is wrong in one thing: The prefix is case-sensitive, so ignoreCase must not be used, Ref_ != ref_!

Actually, another thing of this notation is, that it is hard to read for everyone that is not an Java/Kotlin expert. I would like to avoid streaming-APIs, unless there is a big benefit, so this code should be (IMHO):

private fun policyFor(tag: String): TagProcessingPolicy {
  for ((prefix, policy) in PREFIX_TO_POLICY) {
    if (tag.startsWith(prefix)) return policy
  }
  return DEFAULT_POLICY
}

This does not make the code slower, nor longer, but much more readable IMHO. Lets please use loops, unless there is realy specific situation in which a streaming-api provides an advantage.

There is another point against streaming API, I saw this so often now. People tend to ignore, that behind a stream there is a loop, and you then often find things like this, especially with inexperienced developers:

val a = m.map { (k, v) -> v.a }
val b = m.map { (k, v) -> v.b }
...

I really saw this often. If that would have been a loop, it would become clear that you can collect both properties in one loop, you do not need two loops, but the code hides the loop and therefore makes it hard to understand what happens.

When loops can be collapsed, it should be done, looping above things is not free of cost IMHO.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The firstOrNull triggers loop that exits on the first predicate-check or return nulls if nothing satisfied the condition. The loop runs only once. As you mentioned - all map, filter etc operations trigger traversal, sometimes its full, sometimes not (like in this original case).

The remarkd regarding readability is more crucial IMO, if you don't "feel it" then let's ditch it - c-like style should be fine for all so I'm switching to your suggestion

}
}

private tailrec fun whereNestedTags(tagQuery: ITagQuery) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the compiler give a warning? I think this only becomes a tailrec function, if you make the not, or, and, and multiClause explicitly an inline function or? otherwise we leave it to the JIT, and the JIT is not able to do tailrec, it does not know about it, or do I miss something here (does Kotlin inline the private functions anyway)?

Copy link
Collaborator

@cyberhead-pl cyberhead-pl Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tailrec works on compilation level, and it will try to replace recursion with while loop during compilation - that's all, this is not a suggestion for runtime

Copy link
Collaborator Author

@Amaneusz Amaneusz Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are couple of things to consider here

  • tailrec does not need function to be marked as inline
  • tailrec as keyword is a compiler hint, nothing more - by itself it does nothing
  • for the tailrec hint to work, the function must satisfy some other rules for the compile-time optimization to take place

The last point is where I screwed up ;) tailrec won't be applied here because function must end with self-call for optimization to take place. So here, the tailrec keyword won't do anythign. It won't help, it won't throw error.
i removed it, it will use simple recursion explicitly.

source: https://kotlinlang.org/docs/functions.html#tail-recursive-functions

Signed-off-by: Jakub Amanowicz <[email protected]>
val i = normalizedTag.indexOf('=')
val key: String
val value: Any?
if (i > 1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will fail for a=1, because in this case index will be 1

private fun resolveSingleTagQuery(tagQuery: TagQuery) {
when (tagQuery) {
is TagExists -> {
where.append("$tagsAsJsonb ?? '${tagQuery.name}'")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is prone to SQL injections and ASCII zero injections. Values that are provided by the client, like tag-names, must only be used in prepared-statements! You can quote this alternatively, still, it may have a bad impact on performance, when it is executed multiple times, because it can't be cached by the query parser (very likely does not matter in this case, because tag queries will be slow anyway).

}

is TagValueIsNull -> {
where.append("${tagValue(tagQuery)} = null")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does not work, it need to be something IS NULL, NULL is like NaN, any value compared against it will be false (well, not exactly, but it is good to remember it like this).


is TagValueMatches -> {
val regex = tagQuery.regex
where.append("$tagsAsJsonb @?? '\$.${tagQuery.name} ? (@ like_regex \"${regex}\")'")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this is prone to SQL and ASCII zero injections, because not correctly escaped (remember, tags that start with @ allow all unicode characters! For regular expression, we should review carefully, if there is no way of code execution or query manipulation.

return when (castTo) {
null -> "$tagsAsJsonb->'${tagQuery.name}'"
PgType.STRING -> "$tagsAsJsonb->>'${tagQuery.name}'"
else -> "($tagsAsJsonb->'${tagQuery.name}')::${castTo.value}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this prone to SQL injections and ASCII zero attacks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants