-
-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix PGN test by differentiating single and multigame PGN parsing #42
Conversation
…arsed, handle linebreaks accordingly
Good question. We cannot support every weird syntax or edge cases. Are whitespace lines common in PGN or not? I have no idea. Sorry I can't answer your second question, I haven't thought about it. So maybe that answers my question above :) |
I looked at the lila repo and found something which might be relevant: I have not worked with scala or the lila repo before this, but based on my understanding, when importing PGNs in studies the string is split on the regex The main drawback of this solution will be that it will break an existing test case/scenario where there are no tags at the start of a game which is not the first game in a multi-game PGN string: https://github.com/lichess-org/dartchess/blob/main/data/leading-whitespace.pgn (See fourth game). However, this will be consistent with the current lila implementation. If you are still ok with this, I can have a go at implementing it. If the above is unacceptable, the only alternative solution I can think of will require checking the move numbers. While possible to implement, it will be more complex than the above solution: the current move number will need to be tracked and a lookahead will need to be done for the next move number whenever an empty line is found. |
It is fine by me, this extra test case can be ignored, it is not an important one. It really makes sense that multi game png are separated by headers. We should also keep this test case but with the expected result changed to make it clear we don't support this by specification. |
Updated the PR and marked it Ready for review Fix for #30, based on approach discussed above:
While working on this, I also implemented a fix for #38. This fix is in the Parser but the root cause is different from #30: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this contribution! Just need some clarification on some tests and we can merge this.
data/wcc_2023_eolcomment.pgn
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hard to see the difference with the existing wcc_2023.pgn. What is it exactly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new testcase is the file attached in #38. The main difference is on the line shown in the screenshot of that issue. The original file does not have a newline after the {
.
final game = PgnGame.parsePgn(data); | ||
expect( | ||
game.comments, | ||
[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure to understand what wasn't working before here. Parsing the wcc_2023
was working fine as far as I can tell.
Sorry if I didn't get right your explanation in the PR description. Can you try to explain quickly what kind of bug is tested here? (perhaps as a code comment in the test so future readers will know too).
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can find an explanation in the second part of the reply here: #42 (comment) I have also updated the test cases to include a description of the exact scenario being tested (#38).
Just to be clear and avoid any confusion, this PR is fixing both #30 and #38. Both issues are bugs in the PGN parser but in separate areas of the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation!
This fixes #30 by adding another field to _PgnParser indicating whether it is parsing a single game PGN or multi game PGN. However I am marking this as draft because there is an untested scenario for multi game PGNs which will still fail.
The root cause is the following line which resets the parser and treats any empty/whitespace line as end of the game if the parser is in
case _ParserState.moves
.dartchess/lib/src/pgn.dart
Line 779 in 65fad86
Post the fix:
case _ParserState.moves
(new behaviour).case _ParserState.moves
(current behaviour). This is required for current multi game PGN test scenarios to be parsed correctly.This will resolve the issue for single game PGNs tested by the test case, but will still continue failing for multigame PGNs. A simple example scenario which will still fail:
This will be parsed as three separate games due to the empty line before 3. Bb5:
Note that currently there are no test cases for the multigame PGN scenario described here.
@veloce My questions for you:
3. Bb5
is obviously not the start of a new game). However move numbers seem to be completely ignored in the current parsing logic, so this might require significant changes.