-
-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding GNU C++ demangler #1834
base: main
Are you sure you want to change the base?
Adding GNU C++ demangler #1834
Conversation
WalkthroughThe pull request introduces the PEGTL (Parsing Expression Grammar Template Library) as a new third-party dependency for the PCSX-Redux project. This involves adding a Git submodule for PEGTL, updating build configurations across different platforms (Makefile and Visual Studio), and implementing a new GNU C++ name demangler using the PEGTL library. The changes include modifying project files to include the new library's headers, adding a new source and header file for the demangler, and updating the project's license documentation. Changes
Sequence DiagramsequenceDiagram
participant Caller
participant Demangler
participant PEGTL
Caller->>Demangler: demangle(mangled_name)
Demangler->>PEGTL: Parse mangled name
PEGTL-->>Demangler: Parsing result
alt Parsing Successful
Demangler-->>Caller: Demangled name
else Parsing Failed
Demangler-->>Caller: Original mangled name
end
Poem
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/support/gnu-c++-demangler.cc (1)
366-371
: Enhance error handling in thedemangle
function.Currently, if parsing fails, the function returns the original mangled name without any indication of the error. Consider logging the error or providing more detailed error information to assist with debugging.
src/support/gnu-c++-demangler.h (1)
36-38
: Add documentation for public functions.The functions
internalCheck()
,trace(std::string_view mangled)
, anddemangle(std::string_view mangled)
are part of the public interface. Adding documentation comments will improve readability and help other developers understand the purpose and usage of these functions.LICENSES.md (1)
56-56
: Fix markdown list indentation.The list item indentation is inconsistent with the rest of the file. Consider fixing the indentation to match other entries.
- - [PEGTL](https://github.com/taocpp/PEGTL) + - [PEGTL](https://github.com/taocpp/PEGTL)🧰 Tools
🪛 Markdownlint (0.37.0)
56-56: Expected: 0; Actual: 1
Unordered list indentation(MD007, ul-indent)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
.gitmodules
(1 hunks)LICENSES.md
(1 hunks)Makefile
(1 hunks)src/support/gnu-c++-demangler.cc
(1 hunks)src/support/gnu-c++-demangler.h
(1 hunks)third_party/PEGTL
(1 hunks)vsprojects/common.props
(1 hunks)vsprojects/pcsx-redux.sln
(2 hunks)vsprojects/support/support.vcxproj
(2 hunks)vsprojects/support/support.vcxproj.filters
(2 hunks)
✅ Files skipped from review due to trivial changes (1)
- third_party/PEGTL
👮 Files not reviewed due to content moderation or server errors (1)
- vsprojects/support/support.vcxproj.filters
🧰 Additional context used
🪛 Markdownlint (0.37.0)
LICENSES.md
56-56: Expected: 0; Actual: 1
Unordered list indentation
(MD007, ul-indent)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (8)
vsprojects/common.props (1)
11-11
: Verify the correctness of the added include directory path.The path
$(SolutionDir)..\third_party\PEGTL\include;
has been added toAdditionalIncludeDirectories
. Please verify that this path is correct and accessible during the build process to ensure that the PEGTL headers are included properly.Makefile (1)
39-39
: LGTM!The PEGTL include path is correctly added and follows the existing pattern in the Makefile.
.gitmodules (1)
97-99
: LGTM! Clean addition of PEGTL dependencyThe addition of PEGTL as a submodule follows project conventions and uses the official repository.
vsprojects/support/support.vcxproj (2)
163-163
: LGTM! Header file addition follows conventionsThe gnu-c++-demangler.h header is properly integrated into the project structure.
213-215
: Verify if /bigobj is needed for other configurationsThe /bigobj compiler option is added only for Debug|x64. Consider if this option is also needed for other configurations to handle large object files generated by template-heavy PEGTL code.
✅ Verification successful
Keep /bigobj option specific to Debug|x64 configuration
The current configuration appears correct. The gnu-c++-demangler.cc file (20KB) is the largest in the support directory and likely generates large object files specifically in Debug builds due to template expansion and debug symbols. Release builds typically have better template instantiation optimization, explaining why the option isn't needed there.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Check if other configurations might need /bigobj by looking for similar patterns rg -l "bigobj" vsprojects/Length of output: 61
Script:
#!/bin/bash # Get more context around bigobj usage rg -C 5 "bigobj" vsprojects/support/support.vcxproj # Check for the specific file in other configurations rg -A 5 "gnu-c\+\+-demangler\.cc" vsprojects/support/support.vcxproj # Look for other potential template-heavy files fd . src/support -e cc -x ls -l {} \;Length of output: 2767
vsprojects/pcsx-redux.sln (3)
110-114
: LGTM! Well-structured PEGTL integrationThe PEGTL root folder is properly added under third_party with the main header.
780-785
: LGTM! Proper nesting of PEGTL componentsThe folder hierarchy correctly organizes PEGTL components (core, contrib, internal) maintaining clean project structure.
789-789
: Update SolutionGuid to maintain solution identityThe solution GUID has been updated, which is expected when adding new projects.
size *= 10; | ||
size += in.peek_char() - '0'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Check for potential integer overflow in size
calculation.
The size
variable is calculated by accumulating digit characters in lines 103-104. If the mangled name contains a very large number, there is a risk of integer overflow. Consider adding a check to prevent overflow or using a larger integer type like std::size_t
.
while (in.size() > 0 && std::isdigit(in.peek_char())) { | ||
size *= 10; | ||
size += in.peek_char() - '0'; | ||
in.bump(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cast char
to unsigned char
when using std::isdigit
.
In the source_name::match
function, std::isdigit(in.peek_char())
is called on line 102. Since std::isdigit
expects an int
that is representable as an unsigned char
or EOF, passing a char
directly can lead to undefined behavior if char
is signed and contains a negative value. To ensure correct behavior, cast the character to unsigned char
before passing it to std::isdigit
.
Apply this diff to fix the issue:
- while (in.size() > 0 && std::isdigit(in.peek_char())) {
+ while (in.size() > 0 && std::isdigit(static_cast<unsigned char>(in.peek_char()))) {
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
while (in.size() > 0 && std::isdigit(in.peek_char())) { | |
size *= 10; | |
size += in.peek_char() - '0'; | |
in.bump(1); | |
while (in.size() > 0 && std::isdigit(static_cast<unsigned char>(in.peek_char()))) { | |
size *= 10; | |
size += in.peek_char() - '0'; | |
in.bump(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (1)
src/support/gnu-c++-demangler.cc (1)
50-55
: Improve debugbreak implementation for non-Windows platforms.The non-Windows implementation of
debugbreak()
is empty. Consider:
- Using platform-specific debug break mechanisms (e.g.,
raise(SIGTRAP)
on Unix).- Adding a compile-time warning when debug breaks are unavailable.
#ifdef _WIN32 void debugbreak() { if (IsDebuggerPresent()) __debugbreak(); } #else void debugbreak() { #if defined(__unix__) || defined(__unix) || defined(__APPLE__) raise(SIGTRAP); #else #warning "Debug break not implemented for this platform" #endif } #endif
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/support/gnu-c++-demangler.cc
(1 hunks)src/support/gnu-c++-demangler.h
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (10)
- GitHub Check: macos-intel-build-and-test
- GitHub Check: macos-arm-build-and-test
- GitHub Check: build
- GitHub Check: toolchain
- GitHub Check: asan
- GitHub Check: aur-build
- GitHub Check: macos-build-and-test-toolchain
- GitHub Check: cross-arm64
- GitHub Check: coverage
- GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (1)
src/support/gnu-c++-demangler.cc (1)
129-148
:⚠️ Potential issueFix potential issues in source_name implementation.
The
source_name
implementation has several issues:
- Integer overflow possible in size calculation
- Unsafe usage of
std::isdigit
- No bounds checking on string_view access
struct source_name { using rule_t = source_name; using subs_t = pegtl::nothing<void>; template <pegtl::apply_mode A, pegtl::rewind_mode M, template <typename...> class Action, template <typename...> class Control, typename ParseInput, typename... States> [[nodiscard]] static bool match(ParseInput& in, States&&... st) { auto marker = in.template mark<M>(); - unsigned size = 0; + std::size_t size = 0; + const std::size_t maxSize = std::numeric_limits<std::size_t>::max(); while (in.size() > 0 && std::isdigit(static_cast<unsigned char>(in.peek_char()))) { + if (size > maxSize / 10) return marker(false); // Prevent overflow size *= 10; - size += in.peek_char() - '0'; + std::size_t digit = in.peek_char() - '0'; + if (size > maxSize - digit) return marker(false); // Prevent overflow + size += digit; in.bump(1); } if (size == 0) return marker(false); if (in.size() < size) return marker(false); in.bump(size); return marker(true); } };Likely invalid or redundant comment.
src/support/gnu-c++-demangler.h
Outdated
bool internalCheck(); | ||
bool trace(std::string_view mangled); | ||
bool printDot(std::string_view mangled); | ||
std::string demangle(std::string_view mangled); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add documentation for the public API.
The public functions lack documentation. Please add Doxygen-style comments explaining:
- Purpose of each function
- Parameters and their meaning
- Return value semantics
- Example usage if applicable
Example documentation format:
/**
* Performs internal validation of the demangler grammar.
* @return true if the grammar is valid, false otherwise.
*/
bool internalCheck();
/**
* Traces the demangling process for a mangled name.
* @param mangled The mangled name to trace
* @return true if tracing succeeded, false otherwise
*/
bool trace(std::string_view mangled);
|
||
MIT License | ||
|
||
Copyright (c) 2025 PCSX-Redux authors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the copyright year.
The copyright year is set to 2025, which is in the future. Please update it to the current year (2024).
-Copyright (c) 2025 PCSX-Redux authors
+Copyright (c) 2024 PCSX-Redux authors
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Copyright (c) 2025 PCSX-Redux authors | |
Copyright (c) 2024 PCSX-Redux authors |
struct DemanglerState { | ||
std::string tostring() { return ""; } | ||
std::vector<std::string> bag; | ||
std::string_view lastName; | ||
const std::type_info* lastAction; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve DemanglerState implementation.
The DemanglerState
class needs improvements:
tostring()
returns an empty string but should format the state.- Member variables lack documentation.
- Consider making
tostring()
const-qualified.
struct DemanglerState {
/// Returns a string representation of the current state
std::string tostring() const {
std::string result;
if (!bag.empty()) {
result = "Names: ";
for (const auto& name : bag) {
result += name + ", ";
}
result.resize(result.size() - 2); // Remove trailing ", "
}
return result;
}
std::vector<std::string> bag; ///< Collected names during parsing
std::string_view lastName; ///< Most recently parsed name
const std::type_info* lastAction; ///< Type info of last executed action
};
debugbreak(); | ||
throw Unimplemented("local_name"); | ||
} | ||
}; | ||
template <> | ||
struct DemanglerAction<data_member_prefix> { | ||
template <typename ActionInput> | ||
static void apply(const ActionInput& in, DemanglerState& state) { | ||
debugbreak(); | ||
throw Unimplemented("data_member_prefix"); | ||
} | ||
}; | ||
template <> | ||
struct DemanglerAction<seq_id> { | ||
template <typename ActionInput> | ||
static void apply(const ActionInput& in, DemanglerState& state) { | ||
debugbreak(); | ||
throw Unimplemented("seq_id"); | ||
} | ||
}; | ||
template <> | ||
struct DemanglerAction<substitution_simple> { | ||
template <typename ActionInput> | ||
static void apply(const ActionInput& in, DemanglerState& state) { | ||
debugbreak(); | ||
throw Unimplemented("substitution_simple"); | ||
} | ||
}; | ||
template <> | ||
struct DemanglerAction<substitution_std> { | ||
template <typename ActionInput> | ||
static void apply(const ActionInput& in, DemanglerState& state) { | ||
debugbreak(); | ||
throw Unimplemented("substitution_std"); | ||
} | ||
}; | ||
template <> | ||
struct DemanglerAction<substitution_std_allocator> { | ||
template <typename ActionInput> | ||
static void apply(const ActionInput& in, DemanglerState& state) { | ||
debugbreak(); | ||
throw Unimplemented("substitution_std_allocator"); | ||
} | ||
}; | ||
template <> | ||
struct DemanglerAction<substitution_std_basic_string> { | ||
template <typename ActionInput> | ||
static void apply(const ActionInput& in, DemanglerState& state) { | ||
debugbreak(); | ||
throw Unimplemented("substitution_std_basic_string"); | ||
} | ||
}; | ||
template <> | ||
struct DemanglerAction<substitution_std_basic_string_full> { | ||
template <typename ActionInput> | ||
static void apply(const ActionInput& in, DemanglerState& state) { | ||
debugbreak(); | ||
throw Unimplemented("substitution_std_basic_string_full"); | ||
} | ||
}; | ||
template <> | ||
struct DemanglerAction<substitution_std_basic_istream> { | ||
template <typename ActionInput> | ||
static void apply(const ActionInput& in, DemanglerState& state) { | ||
debugbreak(); | ||
throw Unimplemented("substitution_std_basic_istream"); | ||
} | ||
}; | ||
template <> | ||
struct DemanglerAction<substitution_std_basic_ostream> { | ||
template <typename ActionInput> | ||
static void apply(const ActionInput& in, DemanglerState& state) { | ||
debugbreak(); | ||
throw Unimplemented("substitution_std_basic_ostream"); | ||
} | ||
}; | ||
template <> | ||
struct DemanglerAction<substitution_std_basic_iostream> { | ||
template <typename ActionInput> | ||
static void apply(const ActionInput& in, DemanglerState& state) { | ||
debugbreak(); | ||
throw Unimplemented("substitution_std_basic_iostream"); | ||
} | ||
}; | ||
template <> | ||
struct DemanglerAction<substitution> { | ||
template <typename ActionInput> | ||
static void apply(const ActionInput& in, DemanglerState& state) { | ||
debugbreak(); | ||
throw Unimplemented("substitution"); | ||
} | ||
}; | ||
template <> | ||
struct DemanglerAction<std_name> { | ||
template <typename ActionInput> | ||
static void apply(const ActionInput& in, DemanglerState& state) { | ||
debugbreak(); | ||
throw Unimplemented("std_name"); | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve error handling in DemanglerAction implementations.
The current implementation has several issues:
- All actions throw exceptions, making error recovery impossible
- No progress tracking or partial results
- Debug breaks in every action make debugging cumbersome
Consider implementing a more robust error handling strategy:
- Use a Result type instead of exceptions:
template <typename Rule>
struct DemanglerAction : pegtl::nothing<Rule> {};
// Example improved implementation
template <>
struct DemanglerAction<mangled_name> {
template <typename ActionInput>
static void apply(const ActionInput& in, DemanglerState& state) {
if (state.errorOccurred) return; // Skip if already in error state
try {
// Implementation here
} catch (const std::exception& e) {
state.errorOccurred = true;
state.errorMessage = e.what();
state.errorLocation = in.position();
}
}
};
- Add progress tracking:
struct DemanglerState {
// ... existing members ...
/// Tracks parsing progress
struct Progress {
std::size_t totalRules = 0;
std::size_t completedRules = 0;
std::string currentRule;
} progress;
/// Error state
bool errorOccurred = false;
std::string errorMessage;
pegtl::position errorLocation;
};
std::string PCSX::GNUDemangler::demangle(std::string_view mangled) { | ||
try { | ||
DemanglerState state; | ||
pegtl::string_input in(mangled, "mangled"); | ||
auto result = pegtl::parse<mangled_name, DemanglerAction>(in, state); | ||
if (result) state.tostring(); | ||
} catch (Unimplemented unimplemented) { | ||
std::cout << unimplemented.name; | ||
} catch (...) { | ||
} | ||
return std::string(mangled); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve error handling in demangle function.
The current implementation has several issues:
- Silently catches all exceptions
- Only prints unimplemented rule names
- Always returns the original mangled name on failure
Consider implementing proper error reporting:
struct DemangleError {
enum class Type {
Unimplemented,
InvalidInput,
ParseError,
Internal
};
Type type;
std::string message;
std::optional<std::string> ruleName;
};
std::variant<std::string, DemangleError> demangle(std::string_view mangled) {
try {
DemanglerState state;
pegtl::string_input in(mangled, "mangled");
if (!pegtl::parse<mangled_name, DemanglerAction>(in, state)) {
return DemangleError{
DemangleError::Type::ParseError,
"Failed to parse mangled name",
std::nullopt
};
}
return state.tostring();
} catch (const Unimplemented& e) {
return DemangleError{
DemangleError::Type::Unimplemented,
"Unimplemented demangling rule",
std::string(e.name)
};
} catch (const std::exception& e) {
return DemangleError{
DemangleError::Type::Internal,
e.what(),
std::nullopt
};
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/support/gnu-c++-demangler.cc (2)
5-5
: Update the copyright year.The copyright year is set to 2025, which is in the future. Consider updating it to the current year.
-Copyright (c) 2025 PCSX-Redux authors +Copyright (c) 2024 PCSX-Redux authors
50-55
: Enhance debug break for non-Windows platforms.The non-Windows version of
debugbreak()
is empty. Consider using platform-specific debug break mechanisms for better debugging support.#else -void debugbreak() {} +void debugbreak() { +#if defined(__APPLE__) + __builtin_debugtrap(); +#elif defined(__unix__) + raise(SIGTRAP); +#else + // Fall back to empty implementation +#endif +}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/support/gnu-c++-demangler.cc
(1 hunks)src/support/gnu-c++-demangler.h
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/support/gnu-c++-demangler.h
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: macos-intel-build-and-test
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: build
- GitHub Check: asan
- GitHub Check: macos-build-and-test-toolchain
- GitHub Check: toolchain
- GitHub Check: coverage
- GitHub Check: aur-build
- GitHub Check: cross-arm64
🔇 Additional comments (5)
src/support/gnu-c++-demangler.cc (5)
58-63
: Improve DemanglerState implementation.The current implementation has several issues:
tostring()
returns an empty string but is used indemangle()
- Member variables lack documentation
- Consider making
tostring()
const-qualified
138-140
: Check for potential integer overflow insize
calculation.The
size
variable is calculated by accumulating digit characters. If the mangled name contains a very large number, there is a risk of integer overflow.
138-141
: Castchar
tounsigned char
when usingstd::isdigit
.Using
std::isdigit
withchar
can lead to undefined behavior ifchar
is signed and contains a negative value.
382-1870
: Improve error handling in DemanglerAction implementations.The current implementation has several issues:
- All actions throw exceptions, making error recovery impossible
- No progress tracking or partial results
- Debug breaks in every action make debugging cumbersome
1894-1905
: Improve error handling in demangle function.The current implementation has several issues:
- Silently catches all exceptions
- Only prints unimplemented rule names
- Always returns the original mangled name on failure
No description provided.