-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: Remove redundant this when referring to member variables. #60
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: Lin Zhihao <[email protected]>
…eterministic. This is fine as tags are unique across disjoint subtrees.
Co-authored-by: Lin Zhihao <[email protected]>
…og-surgeon into individual-files
…ypedefs to the top of the file to fix compilation error.
…pp files; Also remove RegexDFA.tpp file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (9)
src/log_surgeon/Lalr1Parser.tpp (5)
94-96
: Simplify theadd_token_chain
function by refactoringIn lines 94-96 of the
add_token_chain
function, constructing therule_chain
manually for each character can be error-prone and hard to maintain. Consider refactoring the code to build the chain using a loop, which will make it more scalable and readable.Here's an example of how you might refactor the code:
std::unique_ptr<finite_automata::RegexAST<TypedNfaState>> rule_chain = std::make_unique<finite_automata::RegexASTLiteral<TypedNfaState>>(chain[0]); for (uint32_t i = 1; i < chain.size(); i++) { std::unique_ptr<finite_automata::RegexASTLiteral<TypedNfaState>> next_char_rule = std::make_unique<finite_automata::RegexASTLiteral<TypedNfaState>>(chain[i]); rule_chain = std::make_unique<finite_automata::RegexASTCat<TypedNfaState>>( std::move(rule_chain), std::move(next_char_rule) ); }
196-196
: Usefalse ==
instead of!
according to coding guidelinesIn line 196, you have
if (!item_set_ptr->m_closure.insert(*item).second)
. As per the coding guidelines, preferfalse == <expression>
rather than!<expression>
.Apply this diff:
- if (!item_set_ptr->m_closure.insert(*item).second) { + if (false == item_set_ptr->m_closure.insert(*item).second) {
Line range hint
223-223
: Replacethis->m_lexer
withm_lexer
to follow the new standardIn line 223, you use
this->m_lexer.m_symbol_id
. Replacethis->m_lexer
withm_lexer
to align with the member variable naming convention.Apply this diff:
- if (this->m_lexer.m_symbol_id.find(head) == this->m_lexer.m_symbol_id.end()) { + if (m_lexer.m_symbol_id.find(head) == m_lexer.m_symbol_id.end()) {
633-633
: Preferfalse == accept
over!accept
In line 633, you have
if (!accept)
. According to the coding guidelines, prefer usingfalse == accept
instead.Apply this diff:
- if (!accept) { + if (false == accept) {
698-698
: Usefalse == is_accepting
instead of!is_accepting
In line 698, within the lambda function, replace
if (!is_accepting)
withif (false == is_accepting)
as per the coding guidelines.Apply this diff:
- if (!is_accepting) { + if (false == is_accepting) {src/log_surgeon/Lalr1Parser.cpp (1)
11-14
: Update member variable initialization to match new naming conventionsIn lines 11-14, in the
NonTerminal
constructor, you can remove unnecessary qualifiers and adhere to the newm_
member variable notation.Apply this diff:
-NonTerminal::NonTerminal(Production* p) - : m_children_start(m_next_children_start), - m_production(p), - m_ast(nullptr) { +NonTerminal::NonTerminal(Production* p) + : m_children_start(m_next_children_start), + m_production(p), + m_ast(nullptr) { m_next_children_start += p->m_body.size(); }src/log_surgeon/finite_automata/Dfa.hpp (1)
54-71
: Provide clarification on UTF-8 handling inget_intersect
methodIn lines 54-71, the
get_intersect
method mentions handling UTF-8 transitions but currently only supports byte transitions. Consider implementing UTF-8 handling or clarifying the comment.Apply this diff to update the TODO comment:
- // TODO: Handle UTF-8 (multi-byte transitions) as well + // Note: Currently, this method handles only byte transitions. UTF-8 support is not implemented yet.src/log_surgeon/finite_automata/Nfa.hpp (2)
151-151
: Consider using a more descriptive lambda parameter name.The lambda parameter name
dest_state
could be more descriptive, such astarget_state
ornext_state
, to better convey its role in the BFS traversal.- = [&state_queue, &visited_states](TypedNfaState const* dest_state) { + = [&state_queue, &visited_states](TypedNfaState const* target_state) { - if (visited_states.insert(dest_state).second) { - state_queue.push(dest_state); + if (visited_states.insert(target_state).second) { + state_queue.push(target_state); } };
Line range hint
165-167
: TODO comment needs attention.The TODO comment about handling the UTF8 case should be addressed to ensure proper support for UTF-8 characters.
Would you like me to help implement the UTF-8 case handling or create a GitHub issue to track this task?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (27)
CMakeLists.txt
(2 hunks)examples/intersect-test.cpp
(4 hunks)src/log_surgeon/BufferParser.hpp
(2 hunks)src/log_surgeon/Lalr1Parser.cpp
(2 hunks)src/log_surgeon/Lalr1Parser.hpp
(7 hunks)src/log_surgeon/Lalr1Parser.tpp
(20 hunks)src/log_surgeon/Lexer.hpp
(6 hunks)src/log_surgeon/Lexer.tpp
(18 hunks)src/log_surgeon/LexicalRule.hpp
(2 hunks)src/log_surgeon/LogParser.cpp
(5 hunks)src/log_surgeon/LogParser.hpp
(4 hunks)src/log_surgeon/Parser.hpp
(1 hunks)src/log_surgeon/Parser.tpp
(3 hunks)src/log_surgeon/ReaderParser.hpp
(2 hunks)src/log_surgeon/SchemaParser.cpp
(1 hunks)src/log_surgeon/SchemaParser.hpp
(4 hunks)src/log_surgeon/finite_automata/Dfa.hpp
(1 hunks)src/log_surgeon/finite_automata/DfaState.hpp
(1 hunks)src/log_surgeon/finite_automata/DfaStatePair.hpp
(1 hunks)src/log_surgeon/finite_automata/DfaStateType.hpp
(1 hunks)src/log_surgeon/finite_automata/Nfa.hpp
(6 hunks)src/log_surgeon/finite_automata/NfaState.hpp
(6 hunks)src/log_surgeon/finite_automata/RegexAST.hpp
(29 hunks)src/log_surgeon/finite_automata/TaggedTransition.hpp
(5 hunks)tests/CMakeLists.txt
(1 hunks)tests/test-NFA.cpp
(1 hunks)tests/test-lexer.cpp
(2 hunks)
✅ Files skipped from review due to trivial changes (2)
- src/log_surgeon/ReaderParser.hpp
- src/log_surgeon/BufferParser.hpp
🚧 Files skipped from review as they are similar to previous changes (15)
- src/log_surgeon/finite_automata/DfaStateType.hpp
- src/log_surgeon/LogParser.hpp
- src/log_surgeon/Parser.tpp
- src/log_surgeon/finite_automata/DfaStatePair.hpp
- tests/CMakeLists.txt
- examples/intersect-test.cpp
- src/log_surgeon/Parser.hpp
- CMakeLists.txt
- src/log_surgeon/LexicalRule.hpp
- src/log_surgeon/finite_automata/TaggedTransition.hpp
- tests/test-NFA.cpp
- src/log_surgeon/LogParser.cpp
- tests/test-lexer.cpp
- src/log_surgeon/Lexer.tpp
- src/log_surgeon/finite_automata/RegexAST.hpp
🧰 Additional context used
📓 Path-based instructions (9)
src/log_surgeon/Lalr1Parser.cpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/finite_automata/Dfa.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/finite_automata/DfaState.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/SchemaParser.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/Lalr1Parser.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/Lexer.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/finite_automata/NfaState.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/SchemaParser.cpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/finite_automata/Nfa.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
📓 Learnings (2)
src/log_surgeon/finite_automata/NfaState.hpp (3)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#47
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:127-128
Timestamp: 2024-11-10T16:46:58.543Z
Learning: `RegexNFAUTF8State` is defined as a type alias for `RegexNFAState<RegexNFAStateType::UTF8>`.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#42
File: src/log_surgeon/finite_automata/RegexNFA.hpp:37-90
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In this codebase, prefer code clarity over efficiency optimizations unless efficiency is a critical concern.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
src/log_surgeon/finite_automata/Nfa.hpp (4)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#42
File: src/log_surgeon/finite_automata/RegexNFA.hpp:37-90
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In this codebase, prefer code clarity over efficiency optimizations unless efficiency is a critical concern.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#47
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:127-128
Timestamp: 2024-11-10T16:46:58.543Z
Learning: `RegexNFAUTF8State` is defined as a type alias for `RegexNFAState<RegexNFAStateType::UTF8>`.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#42
File: src/log_surgeon/finite_automata/RegexNFA.hpp:99-105
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In `src/log_surgeon/finite_automata/RegexNFA.hpp`, it's acceptable to have constructors without the `explicit` specifier. Do not suggest adding `explicit` to constructors in this file.
🔇 Additional comments (16)
src/log_surgeon/Lalr1Parser.tpp (3)
73-73
: 🛠️ Refactor suggestion
Replace this->m_lexer
with m_lexer
to adhere to the new member variable convention
In line 73, you use this->m_lexer.m_symbol_id[name]
. As per the new standard of using the m_
prefix for member variables, please replace this->m_lexer
with m_lexer
.
Apply this diff:
- m_terminals.insert(this->m_lexer.m_symbol_id[name]);
+ m_terminals.insert(m_lexer.m_symbol_id[name]);
Likely invalid or redundant comment.
667-668
: 🛠️ Refactor suggestion
Change this->m_lexer
to m_lexer
when calling scan
In line 668, replace this->m_lexer.scan(...)
with m_lexer.scan(...)
to follow the updated convention.
Apply this diff:
- if (ErrorCode error = this->m_lexer.scan(m_input_buffer, token);
+ if (ErrorCode error = m_lexer.scan(m_input_buffer, token);
Likely invalid or redundant comment.
654-655
: 🛠️ Refactor suggestion
Update this->m_lexer
to m_lexer
in the reset
method
In line 655, you have this->m_lexer.reset();
. Please change it to m_lexer.reset();
to adhere to the new member variable notation.
Apply this diff:
- this->m_lexer.reset();
+ m_lexer.reset();
Likely invalid or redundant comment.
src/log_surgeon/finite_automata/DfaState.hpp (1)
62-63
:
Add bounds checking in the next
method to prevent out-of-bounds access
When accessing m_bytes_transition[character]
in line 63, it's important to ensure that character
is less than cSizeOfByte
to prevent out-of-bounds access.
Refer to the previous review comment addressing this issue.
src/log_surgeon/SchemaParser.hpp (1)
49-49
: LGTM! Type updates are consistent.
The changes align with the broader refactoring effort to standardize type names across the codebase. The updates to the constructor parameter, member variable type, and class inheritance maintain consistency with the new naming convention.
Also applies to: 58-58, 71-72
src/log_surgeon/Lexer.hpp (2)
14-16
: LGTM! Include paths updated correctly.
The include paths have been updated to reflect the new file names, maintaining consistency with the refactoring changes.
23-23
: LGTM! Type updates are comprehensive and consistent.
The changes systematically update all references from the old naming convention to the new one:
- Template parameters from NFAStateType/DFAStateType to TypedNfaState/TypedDfaState
- Method signatures and return types
- Member variable types
- Type aliases in the lexers namespace
Also applies to: 32-36, 49-52, 59-59, 129-129, 159-159, 162-162, 164-164, 168-169
src/log_surgeon/finite_automata/Nfa.hpp (2)
1-2
: LGTM! File-level updates are correct.
The header guards and include paths have been properly updated to reflect the new file name.
Also applies to: 16-16
20-21
: LGTM! Class definition updates are consistent.
The class has been renamed from RegexNFA to Nfa, and the template parameter has been updated to TypedNfaState, maintaining consistency with the new naming convention.
Also applies to: 23-25
src/log_surgeon/finite_automata/NfaState.hpp (3)
130-130
: Well-structured constant comparison!
The condition NfaStateType::Utf8 == state_type
follows the coding guideline of placing the constant on the left side of the comparison, which prevents accidental assignments.
Line range hint 191-191
: Consistent coding style in boolean comparisons.
The expression false == optional_serialized_positive_start_transition.has_value()
follows the coding guideline of using explicit comparisons with boolean literals on the left.
110-114
: Well-structured member variable declarations.
The member variables consistently use the m_
prefix and are logically grouped by their purpose (transitions, states, etc.).
src/log_surgeon/Lalr1Parser.hpp (2)
161-161
: Clean and concise method implementations.
The methods have been simplified by removing redundant this->
qualifiers while maintaining clarity:
has_dot_at_end()
uses a clear comparisonnext_symbol()
directly accesses member variables
Also applies to: 168-168
203-204
: Well-structured template parameters.
The template parameters have been renamed from NFAStateType
and DFAStateType
to more descriptive TypedNfaState
and TypedDfaState
, improving code readability.
src/log_surgeon/SchemaParser.cpp (2)
19-37
: Consistent type alias updates.
The type aliases have been updated to use ByteNfaState
consistently throughout the file, improving code clarity and maintainability. The changes align well with the new naming conventions established in the NFA implementation.
14-14
: Updated include statement.
The include statement has been updated to use the new filename Lalr1Parser.hpp
, maintaining consistency with the renamed file.
References
Description
Old standard in this repository was using
this->var
to refer to member variables in classes. We switched the standard to usem_var
. This PR updates previous code to meet this standard.using Parser<TypedNfaState, TypedDfaState>::m_lexer;
added to the top ofLalr1Parser
variables in thePrivate
section. This allowsthis->m_lexer
of theParser
parent class to be used asm_lexer
inLalr1Parser
.Validation performed
Previously existing tests still succeed.
Summary by CodeRabbit
Release Notes
New Features
PrefixTree
class for managing prefix tree structures in register data.RegisterHandler
class for managing registers using a prefix tree.Dfa
andNfa
classes for deterministic and non-deterministic finite automata.Improvements
RegexNFA
toNfa
andRegexDFA
toDfa
.Testing
PrefixTree
andRegisterHandler
to validate functionality and error handling.These changes enhance the application's structure, improve readability, and ensure that new features are thoroughly tested.