This repository has been archived by the owner on Sep 30, 2024. It is now read-only.

Search: more robust symbol detection #48212

Closed
jtibshirani opened this issue Feb 24, 2023 · 1 comment
Labels
ranking, team/search-platform (Issues owned by the search platform team)

Comments

@jtibshirani
Member

jtibshirani commented Feb 24, 2023

Zoekt currently uses ctags to detect symbols in documents and determine their types. Symbol information plays an important role in ranking -- files containing symbol matches are boosted highly in the search results.
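
To make the ranking effect concrete, here is a minimal sketch of a symbol boost. It is a toy model, not Zoekt's actual scoring: the `Symbol` and `File` types, the `score` and `rank` functions, and both weights are hypothetical names and values chosen for illustration. It assumes only that a file whose detected symbols include the query term receives a large bonus on top of a plain content match.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Symbol is a hypothetical record of one symbol reported by a tagger such as ctags.
type Symbol struct {
	Name string
	Kind string // e.g. "function", "class"
}

// File is a hypothetical indexed document together with its detected symbols.
type File struct {
	Path    string
	Content string
	Symbols []Symbol
}

// Illustrative weights only; these are not Zoekt's real scoring constants.
const (
	contentMatchScore = 1.0
	symbolMatchBoost  = 10.0
)

// score returns a toy relevance score for a query against a single file.
func score(f File, query string) float64 {
	s := 0.0
	if strings.Contains(f.Content, query) {
		s += contentMatchScore
	}
	for _, sym := range f.Symbols {
		if sym.Name == query {
			s += symbolMatchBoost // applied at most once per file
			break
		}
	}
	return s
}

// rank returns file paths ordered by descending score, keeping input order for ties.
func rank(files []File, query string) []string {
	sorted := append([]File(nil), files...) // copy so the caller's slice is untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		return score(sorted[i], query) > score(sorted[j], query)
	})
	paths := make([]string, 0, len(sorted))
	for _, f := range sorted {
		paths = append(paths, f.Path)
	}
	return paths
}

func main() {
	files := []File{
		{Path: "caller.go", Content: "x := parseQuery(input)"},
		{
			Path:    "parser.go",
			Content: "func parseQuery(s string) Query { ... }",
			Symbols: []Symbol{{Name: "parseQuery", Kind: "function"}},
		},
	}
	fmt.Println(rank(files, "parseQuery"))
}
```

Under this model, the boost depends entirely on what the tagger reports: a file that gets a spurious symbol entry is scored the same as one that really defines the symbol, which is why tagger errors surface directly as ranking surprises.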

We've traced several complaints about surprising ranking to incorrect ctags output:

Also here's an example where we mistake comments for symbols:

[Screenshot: 2023-02-24 at 12:11:50 PM]

It would be great to switch to a more robust way of detecting symbols. Maybe we could integrate more closely with code intel and take advantage of our work on SCIP and tree-sitter? This issue is meant to document + brainstorm the problem.

/cc @sourcegraph/search-core

@jtibshirani added the team/search-platform and ranking labels on Feb 24, 2023
@jtibshirani
Member Author

Closing in favor of https://github.com/sourcegraph/sourcegraph/issues/52467.
