-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Get word from complement lazy determinization #420
Conversation
9e4b886
to
1400462
Compare
1400462
to
b1dd803
Compare
@@ -249,7 +249,7 @@ namespace { | |||
} | |||
|
|||
// add moves of S to the sync ex iterator | |||
// TODO: shouldn't we also reset first? | |||
synchronized_iterator.reset(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happened here? Is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even after looking at it previously, I still cannot explain how it managed to work without the reset. One should intuitively clear the iterator for every macrostate. clear()
is a fast O(1) operation, so I added it as this is what I would do if I implemented the function again. I can take a look once more and probably try to debug it to make sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, I have done some more digging and I got it now. It works thanks to a lucky coincidence. Since we have been up until now always using the SynchronizedIterator
(called advancer further for clarity) only to fully iterate over the whole vectors iterated over, the advancer remains clear after the while loops while(iterator.advance())
. That holds only because at the end of every advance()
, the iterators kept in the advancer are popped from the advancer when they iterated all the way to their respective end iterators. If we were to stop the iteration earlier for any macrostate, the iterators would remain in the advancer and the next macrostate would only add new iterators to these iterators from the previous macrostate. The advancer would iterate over all of them (the remaining ones and the new one which one actually want to iterate over) together, causing the advance()
to not work correctly.
As a rule of thumb, we should therefore always clear our advancers to make sure that this happenstance cannot occur. Henceforth, I consider this change to add clear()
everywhere a valid one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, can't say I fully understand it, but we will test it by including it in noodler.
We will see how it works. I used iterators to the subset map, as it is 10 times faster on even the smallest NFAs, and presumably more on larger ones. Whether it is enough remains to be seen. |
This PR implements a method to get an arbitrary word from a complement of a language lazily, as requested by #415. That is, we start computing the complement by determinizing the existing NFA while making it complete. We immediately stop when we encounter a macrostate which is not final. Then we can return the access word for this macrostate as an arbitrary word from the complement.
The PR utilizes the overall algorithm of determinization, but does not run nfa::determinize() directly, as there are too many changes to extract any common operations without significantly overengineering the function.
The PR implements the callback function for nfa::determinize() discussed in #415, but since the proper handling of all edge cases proved to be more complex, the callback function is not being used in
Nfa::get_word_from_complement()
in the end. The callback might yet come in handy for lazily getting an arbitrary word from the language difference, which will be implemented in a future PR.This PR implements the first requested operation from #415.