languages: copy over latest version from sourcegraph #828
I realised we haven't been updating this package as we updated the package in the sourcegraph repo. We don't need all the functionality it has, but it's easier to just copy paste everything. Test Plan: go test
Fuzz test failed on commit 78a7ea4. To troubleshoot locally, use the GitHub CLI to download the seed corpus with
OK, there are issues here. The issue boils down to how we changed normalization between the two versions of the package. I actually have bigger concerns around this now. We just lookup by But essentially I feel like when we read in index data from disk we should normalize by the rules of the running process. Then we do the same normalization when executing a query. There is one other issue where we directly look into the go-enry maps to get out extensions. That code also seems very fragile. cc @jtibshirani who I think looked at all this code recently.
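To make the proposal above concrete, here is a minimal sketch of applying one normalization function on both sides of the comparison: once to the value read from the index and once to the query value. Everything named here (`canonicalNames`, `normalizeLanguage`, `languageMatches`) is hypothetical and only for illustration; it is not the zoekt or sourcegraph implementation, and the real canonical names would presumably come from go-enry plus the custom language support.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalNames is a hypothetical table mapping a lowercased language name
// to the casing the running process treats as canonical. In practice this
// would be derived from whatever language data the process was built with.
var canonicalNames = map[string]string{
	"go":         "Go",
	"javascript": "JavaScript",
	"c++":        "C++",
}

// normalizeLanguage (hypothetical helper) maps any casing of a known
// language name to the canonical form of the running process. Unknown names
// fall back to their lowercased form so that comparisons stay consistent.
func normalizeLanguage(name string) string {
	key := strings.ToLower(strings.TrimSpace(name))
	if canonical, ok := canonicalNames[key]; ok {
		return canonical
	}
	return key
}

// languageMatches normalizes both the value stored in the index and the
// value from the query with the same rules before the case-sensitive
// comparison, so an index written by an older version still matches.
func languageMatches(stored, query string) bool {
	return normalizeLanguage(stored) == normalizeLanguage(query)
}

func main() {
	fmt.Println(languageMatches("javascript", "JavaScript"))
	fmt.Println(languageMatches("Go", "go"))
	fmt.Println(languageMatches("Go", "C++"))
}
```

The point of the sketch is that the on-disk casing stops mattering: whichever version wrote the index, both sides are normalized by the rules of the process doing the comparison.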
Let's move the lib/codeintel/languages package to a public repo and add a dependency on that? Otherwise, having to maintain this copy separately is going to get tedious quickly, and reviewing changes is also not that simple.
Wrt the normalization, I don't fully understand the requirements -- do we need to maintain some kind of compatibility between successive versions of the package?
Agreed, this makes sense to be shared. However, there is a tricky requirement around testing since the implementation of this package is tied directly to the version of go-enry. Maybe what we can do is expose a package which contains the tests, then as part of CI for sourcegraph and zoekt we run those tests to ensure we are compatible with the resolved version of go-enry?
We do case-sensitive string comparisons on the language value. I noticed that while doing this upgrade the casing we chose for some stuff changed. I didn't exactly track it down, but I believe it has something to do with things migrating between being inside of go-enry vs our custom support (I believe the code made different decisions around deciding to ToLower a string). I need to look into it more to give you a definite answer, but I saw enough to mark this PR as draft :)
+1 let's do that! Could we commit to
@varungandhi-src @keegancsmith I'd like to revisit this PR. What's your thinking on moving the
@jtibshirani Yes, let's do it. I'm OOO for Lunar New Year next week. If you want to do this in the fastest way possible, feel free to copy the code to a separate public repo. (I can work on merging it with other code later.) If you want to keep it with other code, I recommend creating a If you don't have time the coming week, I can follow up on it after coming back. You should be able to get repo maintainer/admin level access through Entitle.