Refactor types and names #299

Merged
merged 17 commits into devel from refactor_types_and_names
Aug 18, 2023

Conversation

Collaborator

@Adda0 Adda0 commented Aug 15, 2023

This PR refactors some types, modifies some names, and in general resolves the majority of the issues from #292 and #283.

I suggest reviewing the code commit by commit to get an idea of what each change does and what its impact is.

Further changes regarding transitions iterator and alphabet will come in their respective PRs as those are rather large modifications.

codecov bot commented Aug 15, 2023

Codecov Report

Patch coverage: 84.13% and project coverage change: +0.29% 🎉

Comparison is base (0259d79) 82.58% compared to head (a539cc8) 82.88%.
Report is 2 commits behind head on devel.

Additional details and impacted files
@@            Coverage Diff             @@
##            devel     #299      +/-   ##
==========================================
+ Coverage   82.58%   82.88%   +0.29%     
==========================================
  Files          51       51              
  Lines       10599    10526      -73     
  Branches     1063     1034      -29     
==========================================
- Hits         8753     8724      -29     
+ Misses       1443     1414      -29     
+ Partials      403      388      -15     
| Files Changed | Coverage Δ |
| --- | --- |
| include/mata/nfa/nfa.hh | 89.28% <ø> (-0.37%) ⬇️ |
| include/mata/nfa/plumbing.hh | 75.00% <ø> (ø) |
| include/mata/utils/sparse-set.hh | 76.56% <ø> (+0.75%) ⬆️ |
| include/mata/utils/util.hh | 82.81% <ø> (-1.68%) ⬇️ |
| src/strings/nfa-segmentation.cc | 93.42% <0.00%> (ø) |
| tests/nfa/nfa-profiling.cc | 0.00% <0.00%> (ø) |
| src/alphabet.cc | 53.19% <53.12%> (ø) |
| src/strings/nfa-noodlification.cc | 71.49% <60.00%> (+0.06%) ⬆️ |
| src/nfa/nfa.cc | 59.68% <75.00%> (-2.40%) ⬇️ |
| tests/strings/nfa-string-solving.cc | 90.50% <75.00%> (ø) |

... and 12 more

@Adda0 Adda0 force-pushed the refactor_types_and_names branch from d0590b7 to d0b1e8b Compare August 15, 2023 07:36
@Adda0 Adda0 marked this pull request as ready for review August 15, 2023 07:40
Member

@tfiedor tfiedor left a comment

No major issues.

@@ -26,6 +26,8 @@
namespace Mata {

using Symbol = unsigned;
using Word = std::vector<Symbol>;
Member

I understand you probably discussed this with Lukas, but are you sure it is good to introduce new types after we aggressively removed lots of types? :D

Collaborator Author

We did discuss it. The plan is for Word and WordName to be the interface for working with alphabets, so they deserve types of their own. The user will work with word names (a sequence of strings, one name per symbol in a word), while Mata works internally with words (a sequence of Mata::Symbols). This is the place to discuss the idea with all of us, so feel free to give your opinion.
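
To make the Word / WordName split concrete, here is a minimal sketch. Symbol and Word follow the diff above; the WordName alias is assumed to be a vector of strings, and ToyAlphabet with its translate_symbol() and translate_word() members is a hypothetical stand-in, not Mata's alphabet API.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

namespace sketch {

using Symbol = unsigned;                    // As in the diff above.
using Word = std::vector<Symbol>;           // As in the diff above.
using WordName = std::vector<std::string>;  // Assumed shape: one name per symbol.

// Hypothetical stand-in for an alphabet: assigns a fresh Symbol to each new name.
class ToyAlphabet {
public:
    // Returns the Symbol for `name`, creating a new one on first use.
    Symbol translate_symbol(const std::string& name) {
        const auto [it, inserted] = symbol_map_.try_emplace(name, next_symbol_);
        if (inserted) { ++next_symbol_; }
        return it->second;
    }

    // Translates the user-facing word name into the internal word.
    Word translate_word(const WordName& word_name) {
        Word word;
        word.reserve(word_name.size());
        for (const std::string& name : word_name) { word.push_back(translate_symbol(name)); }
        return word;
    }

private:
    std::unordered_map<std::string, Symbol> symbol_map_;
    Symbol next_symbol_{ 0 };
};

} // namespace sketch
```

In this sketch the user only ever writes names such as "a" or "b", and the numeric symbols stay an internal detail of the toy alphabet.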

Collaborator

I think having this one named is good; it appears in many places later.

Collaborator Author

We will keep it, at least until we see how well it works with refactored alphabets (coming soon to your Mata clone!).

* @param[out] state_renaming Mapping of trimmed states to new states.
* @return New @c Nfa instance from @c this with trimmed states.
*/
Nfa trimmed(StateRenaming* state_renaming = nullptr) const { return Nfa{ *this }.trim(state_renaming); }
Member

I think this naming is confusing. Having both trim and trimmed will only confuse users, so I suggest we brainstorm a better name.

Collaborator Author

The idea is that operations working in place are named after the operation itself (usually a verb), that is, after what it does to the instance it is called on. Operations returning a new instance are named after the result, which in this case is the trimmed NFA. Again, this is only the idea; let us discuss it here. trim() and trimmed() are an example of what it will look like. If we agree on this pattern, all other functions will be renamed similarly.
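
A minimal sketch of the proposed pattern, assuming a toy automaton that only stores states flagged as useful or not. ToyNfa and StateInfo are hypothetical; only the trim() / trimmed() pairing mirrors the code under review.

```cpp
#include <algorithm>
#include <vector>

namespace sketch {

using State = unsigned long;

// Hypothetical toy automaton: only a list of states, each flagged as useful or not.
struct ToyNfa {
    struct StateInfo {
        State id;
        bool useful;
    };
    std::vector<StateInfo> states;

    // In-place operation, named after the action (a verb). Returns *this for chaining.
    ToyNfa& trim() {
        states.erase(
            std::remove_if(states.begin(), states.end(),
                           [](const StateInfo& state) { return !state.useful; }),
            states.end());
        return *this;
    }

    // Copying operation, named after the result. The original is left untouched.
    ToyNfa trimmed() const {
        ToyNfa copy = *this;
        copy.trim();
        return copy;
    }
};

} // namespace sketch
```

The copying variant is implemented on top of the in-place one, in the same spirit as `Nfa{ *this }.trim(state_renaming)` in the diff, so the two cannot drift apart.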

Member

@tfiedor tfiedor Aug 16, 2023

I get the idea, but the naming is probably not ideal. I thought of copy_trimmed() or clone_trimmed() to show that a copy of the original automaton is made.

I think we should more carefully unify the naming convention, mainly for: 1. member operations (like nfa.trim()), 2. non-member operations (like res = trim(nfa)), 3. member operations returning a clone/copy (like res = nfa.trim()).

Collaborator Author

@Adda0 Adda0 Aug 16, 2023

I thought of copy_trimmed() or clone_trimmed() to show that a copy of the original automaton is made.

I would prefer something like this, too, but others have to agree as well.

unify naming convention

That is what I am starting to do here. This is an example of how it will look with trim(); other functions will follow. The naming convention (as of now, without your suggestion) should be:

  1. trim()
  2. trimmed()
  3. trimmed()

However, a lot of the non-member operations should disappear and be replaced with their member variants: determinize(nfa) becomes nfa.determinized(), etc. In some cases, especially for binary operations, both variants will exist: intersection(nfa, nfa2) and nfa.intersection_with(nfa2). The in-place variant may exist, too: concatenation(nfa, nfa2), nfa.concatenate_with(nfa2) (in-place), and nfa.concatenated_with(nfa2).
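
The three variants for a binary operation could coexist roughly as sketched below. ToyLanguage is a hypothetical stand-in that stores a finite language as a set of words, so concatenation is easy to check by hand; only the function names come from the comment above.

```cpp
#include <set>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical stand-in for an automaton: a finite language stored as a set of words.
struct ToyLanguage {
    std::set<std::string> words;

    // In-place variant (verb): replaces this language with its concatenation with `other`.
    ToyLanguage& concatenate_with(const ToyLanguage& other) {
        std::set<std::string> result;
        for (const std::string& prefix : words) {
            for (const std::string& suffix : other.words) { result.insert(prefix + suffix); }
        }
        words = std::move(result);
        return *this;
    }

    // Copying variant (named after the result): leaves both operands untouched.
    ToyLanguage concatenated_with(const ToyLanguage& other) const {
        ToyLanguage copy = *this;
        copy.concatenate_with(other);
        return copy;
    }
};

// Non-member variant of the same binary operation.
inline ToyLanguage concatenation(const ToyLanguage& lhs, const ToyLanguage& rhs) {
    return lhs.concatenated_with(rhs);
}

} // namespace sketch
```

Routing the non-member and copying variants through the in-place one keeps a single implementation, at the cost of one copy of the left operand.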

But this approach still has its faults. We should discuss this further, think about possible problems, ... For further discussion, see #172 and #277 (comment), as well as the other discussions in this PR.

Collaborator

I would refactor the trims first: we need to measure the speed and keep only the fastest, and then think about names. This verb/adjective naming system is not used consistently anyway, so it looks even weirder here.
Actually, I confused it with revert, which I wanted to evaluate.
Here, let's just keep the in-place trim and rename it then.

Collaborator

@kilohsakul kilohsakul Aug 18, 2023

But we can add it in a separate pull request; let's merge this one quickly.

Collaborator Author

That can be done. I will remove trimmed() for now.

Collaborator

@kilohsakul kilohsakul left a comment

I commented on a few small things.

Nfa aut{20};
FILL_WITH_AUT_A(aut);
aut.delta.add(0, EPSILON, 3);
aut.delta.add(3, EPSILON, 3);
aut.delta.add(3, EPSILON, 4);

- auto state_eps_trans{ aut.get_epsilon_transitions(0) };
+ auto state_eps_trans{ aut.delta.epsilon_symbol_posts(0) };
Collaborator

Remove _symbol?

Collaborator Author

Yes.

Collaborator Author

Actually, this is not meant to be read as epsilon_symbol and posts, but as epsilon and symbol_posts, as in the SymbolPost for epsilon from a given source. Therefore, the name should be kept as it is.
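
To illustrate this reading of the name, here is a minimal sketch. ToyDelta, its storage layout, the pointer return type, and the encoding of EPSILON as the largest Symbol value are all assumptions made for this example, not Mata's actual definitions.

```cpp
#include <limits>
#include <map>
#include <set>
#include <vector>

namespace sketch {

using State = unsigned long;
using Symbol = unsigned;
// Assumption for this sketch: epsilon is encoded as the largest Symbol value.
constexpr Symbol EPSILON = std::numeric_limits<Symbol>::max();

// One symbol together with all targets reachable over it from a fixed source state.
struct SymbolPost {
    Symbol symbol;
    std::set<State> targets;
};

// Hypothetical toy transition relation: per source state, a list of SymbolPosts.
class ToyDelta {
public:
    void add(State source, Symbol symbol, State target) {
        std::vector<SymbolPost>& posts = state_posts_[source];
        for (SymbolPost& post : posts) {
            if (post.symbol == symbol) {
                post.targets.insert(target);
                return;
            }
        }
        posts.push_back(SymbolPost{ symbol, { target } });
    }

    // "epsilon" + "symbol_posts": the SymbolPost for epsilon from `source`,
    // or nullptr when `source` has no epsilon transitions.
    // The pointer is only valid until the next call to add().
    const SymbolPost* epsilon_symbol_posts(State source) const {
        const auto it = state_posts_.find(source);
        if (it == state_posts_.end()) { return nullptr; }
        for (const SymbolPost& post : it->second) {
            if (post.symbol == EPSILON) { return &post; }
        }
        return nullptr;
    }

private:
    std::map<State, std::vector<SymbolPost>> state_posts_;
};

} // namespace sketch
```

With the transitions from the test above (0 to 3, 3 to 3, and 3 to 4 over epsilon), epsilon_symbol_posts(3) in this sketch yields a single SymbolPost whose targets are states 3 and 4.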

@Adda0 Adda0 force-pushed the refactor_types_and_names branch from d0b1e8b to a539cc8 Compare August 18, 2023 10:27
@Adda0 Adda0 merged commit 1b7128e into devel Aug 18, 2023
@Adda0 Adda0 deleted the refactor_types_and_names branch August 18, 2023 10:38
Collaborator Author

Adda0 commented Aug 18, 2023

The review changes that were not applied immediately will be applied in future PRs, namely in the alphabet refactoring.

5 participants