User search function, if the user name has - and. Some special symbols will not be found when searching #16675

Closed
matrixbot opened this issue Dec 21, 2023 · 1 comment

Comments

@matrixbot
Collaborator

matrixbot commented Dec 21, 2023

This issue has been migrated from #16675.


Description

User AA-71 exists. The homeserver is matrix.org.
[screenshot]
Searching for aaa-71 does not find the user.
[screenshot]
The request limit is set to 50.
[screenshot]
Only six results are returned.
[screenshot]

Is aaa-71 registered on your matrix.org? The results do not come close to the limit of 50, none of the best local matches are found, and only six are returned?

Steps to reproduce

  • list the steps
  • that reproduce the bug
  • using hyphens as bullet points

Homeserver

matrix.org

Synapse Version

matrixdotorg/synapse:latest

Installation Method

Docker (matrixdotorg/synapse)

Database

PostgreSQL

Workers

Single process

Platform

ubuntu

Configuration

No response

Relevant log output

2023-11-22 11:03:19,246 - synapse.storage.SQL - 449 - DEBUG - POST-222893 - [SQL] {search_user_dir-118fa1} WITH matching_users AS ( SELECT user_id, vector FROM user_directory_search WHERE vector @@ to_tsquery('simple', ?) LIMIT 10000 ) SELECT d.user_id AS user_id, display_name, avatar_url FROM matching_users as t INNER JOIN user_directory AS d USING (user_id) LEFT JOIN users AS u ON t.user_id = u.name WHERE user_id != ? ORDER BY (CASE WHEN d.user_id IS NOT NULL THEN 4.0 ELSE 1.0 END) * (CASE WHEN display_name IS NOT NULL THEN 1.2 ELSE 1.0 END) * (CASE WHEN avatar_url IS NOT NULL THEN 1.2 ELSE 1.0 END) * ( 3 * ts_rank_cd( '{0.1, 0.1, 0.9, 1.0}', vector, to_tsquery('simple', ?), 8 ) + ts_rank_cd( '{0.1, 0.1, 0.9, 1.0}', vector, to_tsquery('simple', ?), 8 ) ) * (CASE WHEN user_id LIKE ? THEN 2.0 ELSE 1.0 END) DESC, display_name IS NULL, avatar_url IS NULL LIMIT ?
2023-11-22 11:03:19,246 - synapse.storage.SQL - 454 - DEBUG - POST-222893 - [SQL values] {search_user_dir-118fa1} ("('eafe':* | 'eafe') & ('dsds':* | 'dsds')", '@0xe52000e012ce8851fcb9532adcc066db55fa53c8:matrix-dev.defed.network', "'eafe' & 'dsds'", "'eafe':* & 'dsds':*", '%:matrix-dev.defed.network', 21)
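The `[SQL values]` line above shows three tsquery strings built from the same two tokens (`eafe` and `dsds`): a combined prefix-or-exact query, an exact query, and a prefix query. The sketch below rebuilds those three strings from a token list so the effect of tokenization on the final query is easier to see. `build_tsqueries` is a hypothetical helper written for this illustration, not Synapse's actual function, and it assumes the tokens contain no quote characters that would need escaping.

```python
from typing import List, Tuple


def build_tsqueries(tokens: List[str]) -> Tuple[str, str, str]:
    """Hypothetical helper: rebuild the three tsquery strings seen in the log.

    Assumes tokens contain no quotes or other characters needing escaping.
    """
    # Each token may match either as a prefix or exactly.
    full = " & ".join(f"('{t}':* | '{t}')" for t in tokens)
    # Every token must match exactly.
    exact = " & ".join(f"'{t}'" for t in tokens)
    # Every token must match as a prefix.
    prefix = " & ".join(f"'{t}':*" for t in tokens)
    return full, exact, prefix


full, exact, prefix = build_tsqueries(["eafe", "dsds"])
print(full)    # ('eafe':* | 'eafe') & ('dsds':* | 'dsds')
print(exact)   # 'eafe' & 'dsds'
print(prefix)  # 'eafe':* & 'dsds':*
```

Because the tokens are joined with `&`, any token that the tokenizer emits but that is not present in the indexed vector makes the whole query fail to match. That is why the way a name like `aaa-71` is split matters.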

Anything else that would be useful to know?

import icu
import re
from typing import List

def parse_words_with_icu(search_term: str) -> List[str]:
    """Break a search term into words using ICU word boundaries."""
    results = []
    breaker = icu.BreakIterator.createWordInstance(icu.Locale.getEnglish())
    breaker.setText(search_term)

    i = 0
    while True:
        j = breaker.nextBoundary()
        if j == icu.BreakIterator.DONE:
            break

        result = search_term[i:j]
        print(result)  # debug output: show every segment ICU produces

        # libicu considers spaces and punctuation between words as words, but we don't
        # want to include those in results as they would result in syntax errors in SQL
        # queries (e.g. "foo bar" would result in the search query including "foo &  &
        # bar").
        if len(re.findall(r"([\w\-]+)", result, re.UNICODE)):
            results.append(result)

        i = j

    return results


# Call the function and print the result
if __name__ == "__main__":
    search_string = "hongtao*aaa"
    print(f"Input: {search_string}")
    parsed_words = parse_words_with_icu(search_string)
    print(parsed_words)

[screenshot]
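For comparison, here is a minimal regex-only tokenizer that needs no ICU installation. It uses the same `[\w\-]+` pattern as the filter in the snippet above, so hyphens stay inside a token while other punctuation acts as a separator. `parse_words_with_regex` is a hypothetical name chosen for this sketch. ICU may place word boundaries differently (for example at the hyphen), so its output for the same input can differ from what is shown here.

```python
import re
from typing import List


def parse_words_with_regex(search_term: str) -> List[str]:
    """Hypothetical regex-only tokenizer: keep runs of word characters and hyphens."""
    return re.findall(r"[\w\-]+", search_term, re.UNICODE)


for term in ("aaa-71", "hongtao*aaa", "foo bar"):
    print(term, "->", parse_words_with_regex(term))
# aaa-71 -> ['aaa-71']
# hongtao*aaa -> ['hongtao', 'aaa']
# foo bar -> ['foo', 'bar']
```

Running both tokenizers on the same user name and comparing the token lists is a quick way to check whether the word-splitting step is where a search for a hyphenated name goes wrong.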
@matrixbot matrixbot changed the title from "Dummy issue" to "User search function, if the user name has - and. Some special symbols will not be found when searching" Dec 22, 2023
@matrixbot matrixbot reopened this Dec 22, 2023
@erikjohnston
Member

Hopefully fixed by #17254
