Automate Generation of QueryParser to C# #268
Comments
I looked into this a bit last night, and the situation is less than ideal. To augment/update the original two possibilities in the original issue from 2019 with some other ideas (items 3+ below):
At this point, I don't know if it makes sense to attempt this for 4.8, given that the porting work is already done. I think we can revisit this for the next post-4.8 release, so I'm moving this back to the Future milestone for now.
@paulirwin - Actually, out of all of the packages, this one is the least "done".
While having no TryParse and having it throw exceptions to loop back to the prior position in a string are also pretty bad, query strings don't tend to be too long, so it is probably okay for this use case. But fixing it so it doesn't throw exceptions just to loop back would be much easier if we generated the code for the query string grammar rather than hand-porting it like this.
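To make the "loop back to the prior position" point concrete, here is a minimal sketch of backtracking without exceptions: each `try...` method remembers the cursor position, and on failure restores it and returns `null` instead of throwing. It is written in Java to mirror the upstream generated code. The class `BacktrackSketch` and its methods are hypothetical illustrations, not part of Lucene or Lucene.NET, and the toy grammar (a quoted phrase or a bare word) is an assumption chosen only to keep the example short.

```java
// Hypothetical sketch: a cursor over a query string that backtracks by
// restoring a saved position rather than by throwing and catching exceptions.
final class BacktrackSketch {
    private final String input;
    private int pos = 0;

    BacktrackSketch(String input) {
        this.input = input;
    }

    // Tries to read a quoted phrase such as "foo bar" at the cursor.
    // On failure the cursor is restored to where it started and null is returned.
    String tryPhrase() {
        int mark = pos; // remember where this attempt started
        if (pos < input.length() && input.charAt(pos) == '"') {
            int close = input.indexOf('"', pos + 1);
            if (close >= 0) {
                String phrase = input.substring(pos + 1, close);
                pos = close + 1; // consume the phrase including both quotes
                return phrase;
            }
        }
        pos = mark; // loop back to the prior position without an exception
        return null;
    }

    // Reads a run of non-whitespace characters; returns null if none is available.
    String tryWord() {
        int start = pos;
        while (pos < input.length() && !Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }
        return pos > start ? input.substring(start, pos) : null;
    }

    // Prefers a quoted phrase and falls back to a bare word.
    String nextTerm() {
        String phrase = tryPhrase();
        return phrase != null ? phrase : tryWord();
    }

    public static void main(String[] args) {
        System.out.println(new BacktrackSketch("\"foo bar\" baz").nextTerm());
        System.out.println(new BacktrackSketch("\"unterminated").nextTerm());
    }
}
```

An unterminated quote falls through to `tryWord` from the original position, which is the case the current hand-ported code handles by throwing and catching.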
I wanted this to be feasible, as I think it's a good goal. And I wish that were "easier," but see my comment above. None of these automated approaches are particularly easy. The generated files are only ~750 lines of Java each, compared to how many thousands of lines and hours of porting javacc, or having ANTLR syntax that does not match upstream, or having to add rewriting and build task support to JavaToCSharp. javacc is not a trivial syntax to implement a generator for, unfortunately. I could easily envision spending hundreds of hours on this problem, when updating the generated code by hand (including even getting it up to date as of 10.0 for the next release) would take one or two orders of magnitude less time. This is compounded by the chicken-and-egg problem of needing to know what our desired generated code would look like with those exception and i18n changes you mention above, before we know how to write a generator for it, at which point we might as well hand-port the first version of that anyway, regardless of whether we automate it or not. Of note, the Lucene team commits their generated Java code to the repo, so it will be easy to do diffs to see what changed, if we want to continue to hand-port it in the future. If we decide to update the generated code manually to factor in the requested changes above, I think we can split that off as its own issue, since that would be separate from "Automate Generation of QueryParser to C#." And that seems like a reasonable thing to include in 4.8.
The requested changes weren't part of the generated code, so we are talking about 2 different things. Given your input, I think that doing this at a later point makes sense if we even do it at all. After working with the implementation of the BCL parsers, it seems like a far better approach to use spans for something like this. Basically, go back to requirements and change the whole approach on how it is done. But that is a lot of work and we should probably wait until later.
Regarding error message localization, honestly, in this day and age, if one really wanted them translated, they could catch the exception and then use an LLM to translate the message to the language(s) of their choice. That may be too pragmatic of a solution, but it would work. :-)
Well, that is the problem. Catching the exceptions is required. The only way to display them to a user is to first catch them and then show the message. .NET parsers usually have a TryParse method. But, it doesn't have that. I see your point though, Ron. As long as we keep doing things the wrong way and throwing the exception, there is not much point in doing the localization because the user has the chance to intervene (since they are forced to catch an exception anyway). But the exception is a problem carried over from Java, where it is considered acceptable to catch an exception for control flow, which is not the case in .NET. Even if we did a cheesy hack and caught the exceptions for the user just to give them a TryParse method, it would at least fix the API and take the responsibility of catching the exceptions away from them. We could then clean it up later without breaking the API. By that, I mean creating an extra method named TryParse and leaving Parse alone.
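The "cheesy hack" described above can be sketched as follows: keep the throwing `parse` untouched and add a second entry point that catches the exception on the caller's behalf. This is only an illustration in Java (to mirror the upstream code); `TryParseSketch`, its `parse` stand-in, and the nested `ParseException` are hypothetical names, not the real QueryParser API. In .NET the same idea would more likely take the conventional `bool TryParse(..., out ...)` shape; `Optional` is used here only because Java has no `out` parameters.

```java
import java.util.Optional;

// Hypothetical sketch of adding a non-throwing entry point next to an
// existing throwing one, without changing the existing method.
final class TryParseSketch {
    // Stand-in for the checked exception thrown by a generated parser.
    static final class ParseException extends Exception {
        ParseException(String message) {
            super(message);
        }
    }

    // Stand-in for the existing throwing parse method; it is left alone.
    static String parse(String query) throws ParseException {
        if (query == null || query.trim().isEmpty()) {
            throw new ParseException("Cannot parse an empty query");
        }
        return "Query(" + query.trim() + ")";
    }

    // The added method: catches the exception for the caller and reports
    // failure through the return value instead.
    static Optional<String> tryParse(String query) {
        try {
            return Optional.of(parse(query));
        } catch (ParseException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("title:lucene").isPresent());
        System.out.println(tryParse("   ").isPresent());
    }
}
```

Because callers only see the return value, the exception-based internals could later be replaced with a non-throwing implementation without breaking the public surface.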
The Lucene team is using a tool called javacc to generate the main business logic behind the query parsers. If we had a similar tool it could help:
The javacc tool uses a configuration file as input and creates java code as output. Here are some examples of those configuration files:
This has not been fully researched, but there are at least 2 potential ways we could approach this:
It seems according to this document that using a port of javacc should be our first choice because of the performance benchmarks of the resultant code. And certainly that would eliminate the risk of having a .NET tool not support an option that we need either now or for some future version of Lucene.
JIRA link - [LUCENENET-620] created by nightowl888