-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split StreamReader
into an interface and implementation to prepare for adding another IR stream reader.
#26
Conversation
WalkthroughThe pull request introduces significant changes to the Changes
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (5)
src/clp_ffi_js/ir/decoding_methods.hpp (1)
9-9
: Add documentation for the get_version function.As this is part of the public API, please add documentation explaining:
- The function's purpose
- Expected return value format
- Any exceptions that might be thrown
- Any preconditions or side effects
src/clp_ffi_js/ir/StreamReaderDataContext.hpp (1)
Line range hint
18-18
: Class name needs to be updated to match the refactoringThe class name should be
IrStreamReaderDataContext
to maintain consistency with the renaming changes mentioned in the PR objectives and summary. This aligns with the pattern of renamingStreamReader
toIrStreamReader
.Apply this diff to update the class name:
-class StreamReaderDataContext { +class IrStreamReaderDataContext {src/clp_ffi_js/ir/decoding_methods.cpp (2)
3-8
: Consider reorganizing include directivesThe includes should be grouped into three distinct blocks separated by blank lines:
- System headers (, , etc.)
- Third-party headers (nlohmann/json)
- Project headers (clp/*)
+// System headers #include <cstdint> -#include <json/single_include/nlohmann/json.hpp> #include <string> #include <string_view> #include <vector> +// Third-party headers +#include <json/single_include/nlohmann/json.hpp> +#include <spdlog/spdlog.h> + +// Project headers #include <clp/ErrorCode.hpp>Also applies to: 11-11, 14-14
24-41
: Enhance error reporting for preamble deserializationThe error handling for preamble deserialization could be more informative.
if (clp::ffi::ir_stream::IRErrorCode::IRErrorCode_Success != deserialize_preamble_result) { SPDLOG_CRITICAL( - "Failed to deserialize preamble for version reading: {}", + "Failed to deserialize IR stream preamble: error_code={}, metadata_size={}", - deserialize_preamble_result + deserialize_preamble_result, + metadata_bytes.size() ); throw ClpFfiJsException{ clp::ErrorCode::ErrorCode_Failure, __FILENAME__, __LINE__, - "Failed to deserialize preamble for version reading." + "Failed to deserialize IR stream preamble. See logs for details." }; }src/clp_ffi_js/ir/IrStreamReader.cpp (1)
193-201
: Ensure safe handling of message content inEM_ASM
When passing
message.c_str()
toEM_ASM
, there's a potential risk if the message contains unexpected characters or is not properly null-terminated. To enhance security and stability, consider validating or sanitizing the message before passing it to JavaScript.You might sanitize the message or ensure it meets expected format constraints before usage.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (7)
src/clp_ffi_js/ir/IrStreamReader.cpp
(1 hunks)src/clp_ffi_js/ir/IrStreamReader.hpp
(1 hunks)src/clp_ffi_js/ir/StreamReader.cpp
(2 hunks)src/clp_ffi_js/ir/StreamReader.hpp
(3 hunks)src/clp_ffi_js/ir/StreamReaderDataContext.hpp
(1 hunks)src/clp_ffi_js/ir/decoding_methods.cpp
(1 hunks)src/clp_ffi_js/ir/decoding_methods.hpp
(1 hunks)
🔇 Additional comments (18)
src/clp_ffi_js/ir/decoding_methods.hpp (3)
1-7
: LGTM! Header organization is clean and follows best practices.
The header guards and includes are well-organized and contain only the necessary dependencies.
8-8
: LGTM! Namespace declaration is appropriate.
The nested namespace follows modern C++ conventions and matches the project structure.
Line range hint 10-17
: Consider separating the rewind and validation responsibilities.
The function name suggests it performs two distinct operations: rewinding and validation. Consider splitting these into separate functions to improve modularity and reusability:
rewind_reader()
validate_encoding_type()
This would better follow the Single Responsibility Principle and make the code more maintainable.
Additionally, since this is a work in progress, we should verify the exception handling across the codebase:
src/clp_ffi_js/ir/StreamReaderDataContext.hpp (2)
14-14
: Documentation update looks good, eh!
The documentation has been properly updated to reference IrStreamReader
instead of StreamReader
.
Line range hint 1-57
: Verify the impact of class renaming
The class implementation looks solid with proper memory management and RAII principles. However, we should verify that all references to this class are updated consistently.
Let's check for any references to this class that need updating:
✅ Verification successful
Let me check the content of IrStreamReader files to ensure the class name is used consistently and there are no mismatches.
No issues found with StreamReaderDataContext class references
After reviewing the implementation files, all references to StreamReaderDataContext
are consistent and properly used throughout the codebase. The class is:
- Correctly included in IrStreamReader.hpp
- Properly used as a template class with
four_byte_encoded_variable_t
- Consistently used in both the header and implementation files
- Appropriately managed with smart pointers and move semantics
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find all references to StreamReaderDataContext that need updating
# Search for direct references to StreamReaderDataContext
echo "Checking for direct references to StreamReaderDataContext:"
rg -l "StreamReaderDataContext"
# Search for template usage patterns
echo -e "\nChecking for template usage patterns:"
rg "StreamReaderDataContext<"
Length of output: 1083
Script:
#!/bin/bash
# Check the content of IrStreamReader files
echo "Content of IrStreamReader.hpp:"
cat src/clp_ffi_js/ir/IrStreamReader.hpp
echo -e "\nContent of IrStreamReader.cpp:"
cat src/clp_ffi_js/ir/IrStreamReader.cpp
Length of output: 14360
src/clp_ffi_js/ir/decoding_methods.cpp (2)
Line range hint 57-82
: LGTM! Existing implementation is robust
The function properly handles reader positioning and encoding type validation with appropriate error handling.
20-55
: 🛠️ Refactor suggestion
Add defensive JSON validation and improve error handling
The implementation looks solid but could benefit from some improvements:
Consider these improvements:
- Add JSON structure validation:
nlohmann::json const metadata = nlohmann::json::parse(metadata_view);
+ if (!metadata.is_object()) {
+ throw ClpFfiJsException{
+ clp::ErrorCode::ErrorCode_MetadataCorrupted,
+ __FILENAME__,
+ __LINE__,
+ "Metadata is not a valid JSON object."
+ };
+ }
+ if (!metadata.contains(clp::ffi::ir_stream::cProtocol::Metadata::VersionKey)) {
+ throw ClpFfiJsException{
+ clp::ErrorCode::ErrorCode_MetadataCorrupted,
+ __FILENAME__,
+ __LINE__,
+ "Version information missing from metadata."
+ };
+ }
- Add more detailed logging:
- SPDLOG_INFO("The version is {}", version);
+ SPDLOG_INFO("Successfully extracted IR stream version: {}", version);
src/clp_ffi_js/ir/StreamReader.cpp (2)
6-6
: Ensure C++20 support for <format>
and std::format
.
The inclusion of <format>
and the usage of std::format
in line 51 rely on C++20 features. Please verify that the project's build configuration and compilers are set to support C++20 to prevent potential compilation issues.
Also applies to: 51-51
64-68
: Confirm class binding consistency in EMSCRIPTEN bindings.
In the EMSCRIPTEN_BINDINGS
section, the class is registered as ClpStreamReader
while binding clp_ffi_js::ir::StreamReader
. Since there's an introduction of IrStreamReader
, please ensure that the class binding accurately reflects the intended class hierarchy and that the JavaScript interface is correctly updated.
src/clp_ffi_js/ir/StreamReader.hpp (4)
30-31
: Use std::unique_ptr
for proper memory management in create
method
Returning a std::unique_ptr<StreamReader>
from the static create
method enhances memory safety by ensuring exclusive ownership and automatic resource cleanup.
34-34
: Declare virtual destructor in base class
Adding a virtual destructor to the StreamReader
base class ensures that destructors of derived classes are called correctly, preventing potential resource leaks.
49-49
: Convert member functions to pure virtual
Declaring methods as pure virtual functions (= 0
) enforces that derived classes implement these interfaces, promoting polymorphism and ensuring consistent behaviour across implementations.
Also applies to: 54-54, 61-61, 69-69, 86-87
89-90
: Make constructor protected to prevent direct instantiation
Moving the constructor to the protected section prevents direct instantiation of the StreamReader
class, indicating its role as an abstract base class intended for inheritance.
src/clp_ffi_js/ir/IrStreamReader.hpp (5)
1-2
: Header guard is correctly defined
The header guard CLP_FFI_JS_IR_IR_STREAM_READER_HPP
is properly set up to prevent multiple inclusions.
27-27
: Verify the necessity of 'friend StreamReader;' declaration
The friend StreamReader;
declaration may not be necessary unless StreamReader
requires access to the private members of IrStreamReader
. Consider removing it if it's not needed.
45-56
: Properly managing copy and move semantics
Disabling the copy constructor and copy assignment operator, while correctly handling the move constructor and deleting the move assignment operator, ensures proper resource management.
103-105
: Efficient use of move semantics in constructor
Utilizing move semantics in the constructor to initialize stream_reader_data_context
optimizes resource allocation.
113-118
: Appropriate management of member variables
The use of smart pointers and standard containers for member variables ensures proper memory management and exception safety.
src/clp_ffi_js/ir/IrStreamReader.hpp
Outdated
* | ||
* @param log_level_filter Array of selected log levels | ||
*/ | ||
void filter_log_events(emscripten::val const& log_level_filter) override; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Ensure appropriate passing of 'emscripten::val'
Consider passing emscripten::val
by value rather than by const reference in filter_log_events
. Since emscripten::val
is a lightweight handle to a JavaScript value, passing it by value might be more suitable and aligns with common practices.
src/clp_ffi_js/ir/IrStreamReader.cpp
Outdated
.constructor( | ||
&clp_ffi_js::ir::IrStreamReader::create, | ||
emscripten::return_value_policy::take_ownership() | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Use return_value_policy::move
for efficiency
In the Emscripten bindings, the constructor uses return_value_policy::take_ownership
, which is typically used for pointers. Since IrStreamReader::create
returns by value and supports move semantics, it's more appropriate to use return_value_policy::move
to avoid unnecessary copies.
Apply this diff to adjust the return value policy:
.constructor(
&clp_ffi_js::ir::IrStreamReader::create,
- emscripten::return_value_policy::take_ownership()
+ emscripten::return_value_policy::move
)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
.constructor( | |
&clp_ffi_js::ir::IrStreamReader::create, | |
emscripten::return_value_policy::take_ownership() | |
) | |
.constructor( | |
&clp_ffi_js::ir::IrStreamReader::create, | |
emscripten::return_value_policy::move | |
) |
src/clp_ffi_js/ir/IrStreamReader.cpp
Outdated
|
||
auto const parsed{log_event.get_message().decode_and_unparse()}; | ||
if (false == parsed.has_value()) { | ||
SPDLOG_ERROR("Failed to decode message."); | ||
break; | ||
} | ||
message = parsed.value(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle message decoding failures explicitly
In the decode_range
method, when message decoding fails, the code logs an error and breaks the loop. This may result in partial results without informing the caller of the failure. Consider throwing an exception to notify the caller about the decoding issue.
Apply this diff to throw an exception upon decoding failure:
auto const parsed{log_event.get_message().decode_and_unparse()};
if (false == parsed.has_value()) {
SPDLOG_ERROR("Failed to decode message.");
- break;
+ throw ClpFfiJsException{
+ clp::ErrorCode::ErrorCode_Corrupt,
+ __FILENAME__,
+ __LINE__,
+ "Failed to decode message"
+ };
}
message = parsed.value();
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
auto const parsed{log_event.get_message().decode_and_unparse()}; | |
if (false == parsed.has_value()) { | |
SPDLOG_ERROR("Failed to decode message."); | |
break; | |
} | |
message = parsed.value(); | |
auto const parsed{log_event.get_message().decode_and_unparse()}; | |
if (false == parsed.has_value()) { | |
SPDLOG_ERROR("Failed to decode message."); | |
throw ClpFfiJsException{ | |
clp::ErrorCode::ErrorCode_Corrupt, | |
__FILENAME__, | |
__LINE__, | |
"Failed to decode message" | |
}; | |
} | |
message = parsed.value(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add this. Messages decoding is not expected to fail unless the file has been corrupted / the deserializer in CLP core does not handle edge cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@junhaoliao, understood. Throwing an exception upon message decoding failure is appropriate here since message decoding failures are not expected unless the file has been corrupted or the deserializer in CLP core does not handle edge cases. I'll note this for future reference.
✏️ Learnings added
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:162-168
Timestamp: 2024-11-04T19:55:26.917Z
Learning: In the `IrStreamReader::decode_range` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, message decoding failures are not expected unless the file is corrupted or the deserializer in CLP core does not handle edge cases, so it's appropriate to throw an exception upon message decoding failure.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:58-59
Timestamp: 2024-11-04T19:35:32.602Z
Learning: In the `IrStreamReader::filter_log_events` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, the wrapper code is typed with TypeScript, ensuring that non-array inputs for `log_level_filter` would fail static checks, so additional runtime error handling is unnecessary.
If you found this review helpful, would you consider giving us a shout-out on X?
Thank you for using CodeRabbit!
src/clp_ffi_js/ir/IrStreamReader.cpp
Outdated
auto filter_levels{emscripten::vecFromJSArray<std::underlying_type_t<LogLevel>>(log_level_filter | ||
)}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling for invalid log_level_filter
input
Currently, the function assumes that log_level_filter
is a valid array. To prevent potential errors if a non-array is passed from JavaScript, consider adding a type check and throwing an exception or returning an error if the input is invalid.
Apply this diff to enhance input validation:
void IrStreamReader::filter_log_events(emscripten::val const& log_level_filter) {
if (log_level_filter.isNull()) {
m_filtered_log_event_map.reset();
return;
}
+ if (!log_level_filter.isArray()) {
+ throw ClpFfiJsException{
+ clp::ErrorCode::ErrorCode_BadParam,
+ __FILENAME__,
+ __LINE__,
+ "Expected an array for log_level_filter"
+ };
+ }
m_filtered_log_event_map.emplace();
auto filter_levels{emscripten::vecFromJSArray<std::underlying_type_t<LogLevel>>(log_level_filter
)};
Committable suggestion was skipped due to low confidence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for now this is skippable since the wrapper code is typed with TypeScript and any usage with non-array would fail static checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@junhaoliao, understood. Since the wrapper code is typed with TypeScript and non-array inputs would fail static checks, additional error handling here is not necessary.
✏️ Learnings added
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:58-59
Timestamp: 2024-11-04T19:35:32.367Z
Learning: In the `IrStreamReader::filter_log_events` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, the wrapper code is typed with TypeScript, ensuring that non-array inputs for `log_level_filter` would fail static checks, so additional runtime error handling is unnecessary.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually sorry for my oversight, I think we want to define a type (similar to FilteredLogEventMapTsType
) instead of directly using emscripten::val
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@junhaoliao, thank you for the clarification. Defining a specific type for log_level_filter
will enhance type safety and improve the code's readability. Here's how you might define a type similar to FilteredLogEventMapTsType
:
using LogLevelFilterTsType = emscripten::val; // Define a type representing the log level filter
void IrStreamReader::filter_log_events(LogLevelFilterTsType const& log_level_filter) {
// Implementation remains the same
}
This approach makes the expected input type explicit in the method signature and aligns with the existing type definitions.
Since this updates our previous understanding and may contradict the earlier learning (ID: 940b05fa-811e-47b0-98a2-bfbbe962ccf3) regarding the unnecessary runtime error handling for log_level_filter
, would you like me to remove this learning? You can also manage learnings directly using the CodeRabbit UI learnings page.
🧠 Learnings used
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:58-59
Timestamp: 2024-11-04T19:35:32.602Z
Learning: In the `IrStreamReader::filter_log_events` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, the wrapper code is typed with TypeScript, ensuring that non-array inputs for `log_level_filter` would fail static checks, so additional runtime error handling is unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Outside diff range and nitpick comments (7)
src/clp_ffi_js/ir/decoding_methods.cpp (1)
3-8
: Consider reorganizing includes for better readabilityConsider grouping includes into standard library, third-party libraries, and project-specific headers, separated by newlines:
-#include <cstdint> -#include <json/single_include/nlohmann/json.hpp> -#include <string> -#include <string_view> -#include <vector> -#include <clp/ErrorCode.hpp> -#include <clp/ffi/ir_stream/decoding_methods.hpp> -#include <clp/ffi/ir_stream/protocol_constants.hpp> -#include <clp/ReaderInterface.hpp> -#include <clp/TraceableException.hpp> -#include <clp/type_utils.hpp> -#include <spdlog/spdlog.h> +// Standard library +#include <cstdint> +#include <string> +#include <string_view> +#include <vector> + +// Third-party libraries +#include <json/single_include/nlohmann/json.hpp> +#include <spdlog/spdlog.h> + +// Project headers +#include <clp/ErrorCode.hpp> +#include <clp/ReaderInterface.hpp> +#include <clp/TraceableException.hpp> +#include <clp/ffi/ir_stream/decoding_methods.hpp> +#include <clp/ffi/ir_stream/protocol_constants.hpp> +#include <clp/type_utils.hpp>Also applies to: 11-11, 14-14
src/clp_ffi_js/ir/StreamReader.hpp (3)
Line range hint
41-42
: Document the implementation dependency more formallyThe comment about LogEventDeserializer creates tight coupling. Consider documenting this relationship more formally in the class documentation.
- // Delete move assignment operator since it's also disabled in `clp::ir::LogEventDeserializer`. + /** + * Move assignment is disabled due to the class's relationship with LogEventDeserializer. + * @note This class follows LogEventDeserializer's contract which disallows move assignment. + */
59-59
: Enhance documentation for error handling behaviourThese pure virtual methods should document their error handling behaviour:
- What exceptions might be thrown?
- What are the preconditions and postconditions?
- virtual void filter_log_events(emscripten::val const& log_level_filter) = 0; + /** + * Generates a filtered collection from all log events. + * @param log_level_filter Array of selected log levels + * @throws std::invalid_argument If the filter is malformed + * @post Filtered collection is updated to reflect the new filter + */ + virtual void filter_log_events(emscripten::val const& log_level_filter) = 0; - [[nodiscard]] virtual auto deserialize_stream() -> size_t = 0; + /** + * Deserializes all log events in the stream. + * @return The number of successfully deserialized ("valid") log events + * @throws std::runtime_error If the stream is corrupted or invalid + * @post The stream is deallocated after processing + */ + [[nodiscard]] virtual auto deserialize_stream() -> size_t = 0;Also applies to: 67-67
28-29
: Consider version handling strategyGiven that the PR objectives mention adding version checking in the factory method, consider:
- Documenting the version compatibility requirements in the class documentation
- Adding version-related methods to the interface
- Implementing version validation in the factory method
This will help ensure proper version handling across different stream formats.
Also applies to: 84-85
src/clp_ffi_js/ir/IrStreamReader.hpp (3)
1-18
: Consider adding comprehensive file-level documentation.While the includes are well-organized, it would be beneficial to add file-level documentation explaining:
- The purpose and scope of this header file
- The relationship between
IrStreamReader
andStreamReader
- Any threading considerations or requirements
74-90
: Enhance decode_range documentation for error handling.The documentation should clarify:
- What specific error conditions might cause a null return
- Whether any exceptions might be thrown
- The behaviour when begin_idx > end_idx
99-110
: Add documentation for private members.Consider adding:
- Documentation for create_data_context explaining ownership transfer of parameters
- Documentation for each member variable explaining their purpose and lifecycle
- Any invariants maintained by these members
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (7)
CMakeLists.txt
(1 hunks)src/clp_ffi_js/ir/IrStreamReader.cpp
(1 hunks)src/clp_ffi_js/ir/IrStreamReader.hpp
(1 hunks)src/clp_ffi_js/ir/StreamReader.cpp
(2 hunks)src/clp_ffi_js/ir/StreamReader.hpp
(3 hunks)src/clp_ffi_js/ir/decoding_methods.cpp
(2 hunks)src/clp_ffi_js/ir/decoding_methods.hpp
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/clp_ffi_js/ir/decoding_methods.hpp
🔇 Additional comments (11)
src/clp_ffi_js/ir/decoding_methods.cpp (2)
Line range hint 20-43
: LGTM! Well-structured error handling
The function properly validates the encoding type with appropriate error handling and clear error messages.
45-80
: Verify version handling across the codebase
Let's verify how the version information is used elsewhere in the codebase to ensure compatibility.
✅ Verification successful
Let me gather more specific information about how the version is used in these files.
Version handling implementation is properly constrained
The version handling is well-defined and properly implemented:
- The
get_version
function is used only inStreamReader.cpp
- Version validation explicitly checks for specific versions: "v0.0.1", "v0.0.0", "0.0.1", "0.0.0"
- Appropriate error handling is in place for unsupported versions
- No other parts of the codebase make assumptions about version format
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for version-related code patterns
echo "Searching for version usage patterns..."
rg -l "VersionKey|get_version" --type cpp
echo "Searching for version format validation..."
ast-grep --pattern 'if ($version $_) { $$$ }'
Length of output: 388
Script:
#!/bin/bash
# Check version-related code in the identified files
echo "=== Content of decoding_methods.hpp ==="
rg -A 5 "get_version|VersionKey" src/clp_ffi_js/ir/decoding_methods.hpp
echo -e "\n=== Content of StreamReader.cpp ==="
rg -A 5 "get_version|VersionKey" src/clp_ffi_js/ir/StreamReader.cpp
# Look for version comparisons or validations
echo -e "\n=== Version comparisons or checks ==="
rg "version|Version" --type cpp -C 3
Length of output: 5891
src/clp_ffi_js/ir/StreamReader.hpp (3)
28-29
: Excellent improvement to memory management!
The shift to std::unique_ptr
return type enforces RAII principles and makes ownership semantics explicit. The [[nodiscard]]
attribute helps prevent accidental resource leaks.
32-32
: Virtual destructor is correctly defaulted
This enables proper cleanup of derived classes through base class pointers.
87-88
: Protected constructor correctly enforces base class design
The protected defaulted constructor properly prevents direct instantiation while enabling derived class construction, which aligns well with the factory method pattern.
src/clp_ffi_js/ir/StreamReader.cpp (3)
5-6
: LGTM: New includes support error handling improvements
The addition of ErrorCode.hpp and format headers properly supports the enhanced error handling mechanism.
Line range hint 21-53
: LGTM: Improved memory management and error handling
The changes demonstrate good practices:
- Use of std::unique_ptr for automatic resource management
- Proper move semantics for transferring ownership
- Comprehensive error handling with detailed messages
Line range hint 58-77
: LGTM: Emscripten bindings properly updated
The bindings correctly reflect the class restructuring and include appropriate memory management policies.
src/clp_ffi_js/ir/IrStreamReader.hpp (2)
32-47
: Well-structured class declaration with proper memory management.
The class follows modern C++ best practices:
- Appropriate use of friend declaration for the factory pattern
- Proper handling of special member functions
- Clear documentation of design decisions
64-64
: Previous review comment about emscripten::val parameter is still applicable.
CMakeLists.txt (1)
117-117
: The addition of IrStreamReader.cpp looks good, eh!
The source file addition aligns with the PR objectives for extracting the StreamReader base class.
Let's verify the presence of corresponding header files:
✅ Verification successful
Beauty! The header files are all present and properly organized, eh!
Found all the expected header files in the src/clp_ffi_js/ir/
directory:
- IrStreamReader.hpp
- StreamReader.hpp
- StreamReaderDataContext.hpp
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the presence of header files for IrStreamReader and StreamReader
# Test: Check if the header files exist
fd -e hpp -e h "IrStreamReader|StreamReader" src/clp_ffi_js/ir/
Length of output: 183
auto get_version(clp::ReaderInterface& reader) -> std::string { | ||
// The encoding type bytes must be consumed before the metadata can be read. | ||
rewind_reader_and_validate_encoding_type(reader); | ||
|
||
// Deserialize metadata bytes from preamble. | ||
clp::ffi::ir_stream::encoded_tag_t metadata_type{}; | ||
std::vector<int8_t> metadata_bytes; | ||
auto const deserialize_preamble_result{ | ||
clp::ffi::ir_stream::deserialize_preamble(reader, metadata_type, metadata_bytes) | ||
}; | ||
if (clp::ffi::ir_stream::IRErrorCode::IRErrorCode_Success != deserialize_preamble_result) { | ||
SPDLOG_CRITICAL( | ||
"Failed to deserialize preamble for version reading: {}", | ||
deserialize_preamble_result | ||
); | ||
throw ClpFfiJsException{ | ||
clp::ErrorCode::ErrorCode_Failure, | ||
__FILENAME__, | ||
__LINE__, | ||
"Failed to deserialize preamble for version reading." | ||
}; | ||
} | ||
|
||
// Deserialize metadata bytes which is encoded in JSON. | ||
std::string_view const metadata_view{ | ||
clp::size_checked_pointer_cast<char const>(metadata_bytes.data()), | ||
metadata_bytes.size() | ||
}; | ||
nlohmann::json const metadata = nlohmann::json::parse(metadata_view); | ||
|
||
// Retrieve version from metadata. | ||
auto const& version{metadata.at(clp::ffi::ir_stream::cProtocol::Metadata::VersionKey)}; | ||
SPDLOG_INFO("The version is {}", version); | ||
|
||
return version; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider adding more robust error handling and validation
While the basic error handling is present, consider these improvements:
- Add validation for the JSON structure and version format
- Consider using
string_view
for better memory efficiency - Provide more specific error messages in exceptions
Here's a suggested improvement:
-auto get_version(clp::ReaderInterface& reader) -> std::string {
+auto get_version(clp::ReaderInterface& reader) -> std::string_view {
// The encoding type bytes must be consumed before the metadata can be read.
rewind_reader_and_validate_encoding_type(reader);
// Deserialize metadata bytes from preamble.
clp::ffi::ir_stream::encoded_tag_t metadata_type{};
std::vector<int8_t> metadata_bytes;
auto const deserialize_preamble_result{
clp::ffi::ir_stream::deserialize_preamble(reader, metadata_type, metadata_bytes)
};
if (clp::ffi::ir_stream::IRErrorCode::IRErrorCode_Success != deserialize_preamble_result) {
SPDLOG_CRITICAL(
- "Failed to deserialize preamble for version reading: {}",
+ "Failed to deserialize preamble: error code {}",
deserialize_preamble_result
);
throw ClpFfiJsException{
clp::ErrorCode::ErrorCode_Failure,
__FILENAME__,
__LINE__,
- "Failed to deserialize preamble for version reading."
+ fmt::format("Failed to deserialize preamble: error code {}", deserialize_preamble_result)
};
}
// Deserialize metadata bytes which is encoded in JSON.
std::string_view const metadata_view{
clp::size_checked_pointer_cast<char const>(metadata_bytes.data()),
metadata_bytes.size()
};
- nlohmann::json const metadata = nlohmann::json::parse(metadata_view);
+ nlohmann::json metadata;
+ try {
+ metadata = nlohmann::json::parse(metadata_view);
+ } catch (const nlohmann::json::parse_error& e) {
+ SPDLOG_CRITICAL("Failed to parse metadata JSON: {}", e.what());
+ throw ClpFfiJsException{
+ clp::ErrorCode::ErrorCode_MetadataCorrupted,
+ __FILENAME__,
+ __LINE__,
+ fmt::format("Failed to parse metadata JSON: {}", e.what())
+ };
+ }
// Retrieve version from metadata.
- auto const& version{metadata.at(clp::ffi::ir_stream::cProtocol::Metadata::VersionKey)};
+ try {
+ auto const& version{metadata.at(clp::ffi::ir_stream::cProtocol::Metadata::VersionKey)};
+ if (!version.is_string()) {
+ throw ClpFfiJsException{
+ clp::ErrorCode::ErrorCode_MetadataCorrupted,
+ __FILENAME__,
+ __LINE__,
+ "Version in metadata is not a string"
+ };
+ }
+ } catch (const nlohmann::json::out_of_range& e) {
+ throw ClpFfiJsException{
+ clp::ErrorCode::ErrorCode_MetadataCorrupted,
+ __FILENAME__,
+ __LINE__,
+ fmt::format("Version key not found in metadata: {}", e.what())
+ };
+ }
SPDLOG_INFO("The version is {}", version);
return version;
}
Committable suggestion skipped: line range outside the PR's diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the rabbit on all those 3 items thought the error messages in the log and throwable should be consistent (don't apply the suggested code directly).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@junhaoliao, understood. I'll ensure that error messages in logs and exceptions are consistent in future suggestions.
✏️ Learnings added
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/decoding_methods.cpp:45-80
Timestamp: 2024-11-04T21:02:17.823Z
Learning: Ensure that error messages in logs and exceptions are consistent.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correction: it is a bad idea to return a string_view in this method since metadata
goes out of scope when this function returns. Let's address only 1 & 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@junhaoliao, thank you for pointing that out. You're correct; returning a std::string_view
in this case is unsafe since metadata
goes out of scope when the function returns. Returning a std::string
is the appropriate choice to ensure the version string remains valid.
✏️ Learnings added
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/decoding_methods.cpp:45-80
Timestamp: 2024-11-04T21:06:24.293Z
Learning: In `src/clp_ffi_js/ir/decoding_methods.cpp`, within the `get_version(clp::ReaderInterface&)` function, avoid returning `std::string_view` when referencing local data that goes out of scope. Instead, return `std::string` to ensure data validity.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some of coderabbit suggestion incliude printing things that are not printable, but otherwise did comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
src/clp_ffi_js/ir/StreamReader.cpp
Outdated
auto const version{get_version(*zstd_decompressor)}; | ||
if (version == "v0.0.1" || version == "v0.0.0" || version == "0.0.1" || version == "0.0.0") { | ||
auto stream_reader_data_context{IrStreamReader::create_data_context( | ||
std::move(zstd_decompressor), | ||
std::move(data_buffer) | ||
)}; | ||
m_encoded_log_events.emplace_back(std::move(log_viewer_event)); | ||
} | ||
m_stream_reader_data_context.reset(nullptr); | ||
return m_encoded_log_events.size(); | ||
} | ||
|
||
auto StreamReader::decode_range(size_t begin_idx, size_t end_idx, bool use_filter) const | ||
-> DecodedResultsTsType { | ||
if (use_filter && false == m_filtered_log_event_map.has_value()) { | ||
return DecodedResultsTsType{emscripten::val::null()}; | ||
} | ||
|
||
size_t length{0}; | ||
if (use_filter) { | ||
length = m_filtered_log_event_map->size(); | ||
} else { | ||
length = m_encoded_log_events.size(); | ||
} | ||
if (length < end_idx || begin_idx > end_idx) { | ||
return DecodedResultsTsType{emscripten::val::null()}; | ||
} | ||
|
||
std::string message; | ||
constexpr size_t cDefaultReservedMessageLength{512}; | ||
message.reserve(cDefaultReservedMessageLength); | ||
auto const results{emscripten::val::array()}; | ||
|
||
for (size_t i = begin_idx; i < end_idx; ++i) { | ||
size_t log_event_idx{0}; | ||
if (use_filter) { | ||
log_event_idx = m_filtered_log_event_map->at(i); | ||
} else { | ||
log_event_idx = i; | ||
} | ||
auto const& log_event{m_encoded_log_events[log_event_idx]}; | ||
|
||
auto const parsed{log_event.get_message().decode_and_unparse()}; | ||
if (false == parsed.has_value()) { | ||
SPDLOG_ERROR("Failed to decode message."); | ||
break; | ||
} | ||
message = parsed.value(); | ||
|
||
m_ts_pattern.insert_formatted_timestamp(log_event.get_timestamp(), message); | ||
|
||
EM_ASM( | ||
{ Emval.toValue($0).push([UTF8ToString($1), $2, $3, $4]); }, | ||
results.as_handle(), | ||
message.c_str(), | ||
log_event.get_timestamp(), | ||
log_event.get_log_level(), | ||
log_event_idx + 1 | ||
return std::unique_ptr<IrStreamReader>( | ||
new IrStreamReader(std::move(stream_reader_data_context)) | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider using version constants and semantic version parsing
The hard-coded version strings ("v0.0.1", "v0.0.0", "0.0.1", "0.0.0") could be difficult to maintain. Consider:
- Defining these as constants
- Implementing semantic version parsing to handle versions more robustly
- Using a version comparison utility to support future versions
Example improvement:
// In a header file:
namespace clp_ffi_js::ir {
constexpr std::array<std::string_view, 4> SUPPORTED_VERSIONS = {
"v0.0.1", "v0.0.0", "0.0.1", "0.0.0"
};
}
// In the implementation:
if (std::ranges::find(SUPPORTED_VERSIONS, version) != SUPPORTED_VERSIONS.end()) {
// ... existing creation logic
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given an IRv1 version validate util function is not to be added to CLP-core soon, I agree with this approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@junhaoliao, understood. I'll note that an IRv1 version validation utility function is not planned to be added to CLP-core in the near future.
✏️ Learnings added
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/StreamReader.cpp:35-45
Timestamp: 2024-11-04T20:55:01.575Z
Learning: An IRv1 version validation utility function is not planned to be added to CLP-core soon.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
src/clp_ffi_js/ir/IrStreamReader.cpp
Outdated
void IrStreamReader::filter_log_events(emscripten::val const& log_level_filter) { | ||
if (log_level_filter.isNull()) { | ||
m_filtered_log_event_map.reset(); | ||
return; | ||
} | ||
|
||
m_filtered_log_event_map.emplace(); | ||
auto filter_levels{emscripten::vecFromJSArray<std::underlying_type_t<LogLevel>>(log_level_filter | ||
)}; | ||
for (size_t log_event_idx = 0; log_event_idx < m_encoded_log_events.size(); ++log_event_idx) { | ||
auto const& log_event = m_encoded_log_events[log_event_idx]; | ||
if (std::ranges::find( | ||
filter_levels, | ||
clp::enum_to_underlying_type(log_event.get_log_level()) | ||
) | ||
!= filter_levels.end()) | ||
{ | ||
m_filtered_log_event_map->emplace_back(log_event_idx); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a type check for log_level_filter
Currently, the function assumes that log_level_filter
is a valid array. To prevent potential runtime errors if a non-array is passed from JavaScript, consider adding a type check and throwing an exception if the input is invalid.
src/clp_ffi_js/ir/IrStreamReader.cpp
Outdated
for (size_t i = begin_idx; i < end_idx; ++i) { | ||
size_t log_event_idx{0}; | ||
if (use_filter) { | ||
log_event_idx = m_filtered_log_event_map->at(i); | ||
} else { | ||
log_event_idx = i; | ||
} | ||
auto const& log_event{m_encoded_log_events[log_event_idx]}; | ||
|
||
auto const parsed{log_event.get_message().decode_and_unparse()}; | ||
if (false == parsed.has_value()) { | ||
SPDLOG_ERROR("Failed to decode message."); | ||
break; | ||
} | ||
message = parsed.value(); | ||
|
||
m_ts_pattern.insert_formatted_timestamp(log_event.get_timestamp(), message); | ||
|
||
EM_ASM( | ||
{ Emval.toValue($0).push([UTF8ToString($1), $2, $3, $4]); }, | ||
results.as_handle(), | ||
message.c_str(), | ||
log_event.get_timestamp(), | ||
log_event.get_log_level(), | ||
log_event_idx + 1 | ||
); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Optimize the loop in decode_range
Instead of manually constructing the results
array using EM_ASM
, consider using a more direct approach with Emscripten bindings to improve readability and maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be explored in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@junhaoliao, understood.
🧠 Learnings used
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:58-59
Timestamp: 2024-11-04T19:35:32.602Z
Learning: In the `IrStreamReader::filter_log_events` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, the wrapper code is typed with TypeScript, ensuring that non-array inputs for `log_level_filter` would fail static checks, so additional runtime error handling is unnecessary.
src/clp_ffi_js/ir/IrStreamReader.cpp
Outdated
auto const parsed{log_event.get_message().decode_and_unparse()}; | ||
if (false == parsed.has_value()) { | ||
SPDLOG_ERROR("Failed to decode message."); | ||
break; | ||
} | ||
message = parsed.value(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle message decoding failures more explicitly
In the decode_range
method, when message decoding fails, the code logs an error and breaks the loop. This might result in partial results without informing the caller of the failure. Consider throwing an exception to notify the caller about the decoding issue.
You can apply this diff to throw an exception upon decoding failure:
auto const parsed{log_event.get_message().decode_and_unparse()};
if (false == parsed.has_value()) {
SPDLOG_ERROR("Failed to decode message.");
- break;
+ throw ClpFfiJsException{
+ clp::ErrorCode::ErrorCode_Corrupt,
+ __FILENAME__,
+ __LINE__,
+ "Failed to decode message"
+ };
}
message = parsed.value();
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
auto const parsed{log_event.get_message().decode_and_unparse()}; | |
if (false == parsed.has_value()) { | |
SPDLOG_ERROR("Failed to decode message."); | |
break; | |
} | |
message = parsed.value(); | |
auto const parsed{log_event.get_message().decode_and_unparse()}; | |
if (false == parsed.has_value()) { | |
SPDLOG_ERROR("Failed to decode message."); | |
throw ClpFfiJsException{ | |
clp::ErrorCode::ErrorCode_Corrupt, | |
__FILENAME__, | |
__LINE__, | |
"Failed to decode message" | |
}; | |
} | |
message = parsed.value(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems to be the same as the above comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@junhaoliao, thank you for pointing that out. It seems the comment was duplicated. Sorry for any confusion.
🧠 Learnings used
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:58-59
Timestamp: 2024-11-04T19:35:32.602Z
Learning: In the `IrStreamReader::filter_log_events` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, the wrapper code is typed with TypeScript, ensuring that non-array inputs for `log_level_filter` would fail static checks, so additional runtime error handling is unnecessary.
src/clp_ffi_js/ir/IrStreamReader.cpp
Outdated
if (std::errc::no_message_available == error) { | ||
break; | ||
} | ||
if (std::errc::result_out_of_range == error) { | ||
SPDLOG_ERROR("File contains an incomplete IR stream"); | ||
break; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uniform error handling upon deserialization failure
When encountering std::errc::result_out_of_range
, the code logs an error and breaks the loop. To maintain consistency and ensure the caller is properly informed of all deserialization failures, consider throwing an exception here as well.
Here's a suggested change:
if (std::errc::result_out_of_range == error) {
SPDLOG_ERROR("File contains an incomplete IR stream");
- break;
+ throw ClpFfiJsException{
+ clp::ErrorCode::ErrorCode_Corrupt,
+ __FILENAME__,
+ __LINE__,
+ "File contains an incomplete IR stream"
+ };
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if (std::errc::no_message_available == error) { | |
break; | |
} | |
if (std::errc::result_out_of_range == error) { | |
SPDLOG_ERROR("File contains an incomplete IR stream"); | |
break; | |
} | |
if (std::errc::no_message_available == error) { | |
break; | |
} | |
if (std::errc::result_out_of_range == error) { | |
SPDLOG_ERROR("File contains an incomplete IR stream"); | |
throw ClpFfiJsException{ | |
clp::ErrorCode::ErrorCode_Corrupt, | |
__FILENAME__, | |
__LINE__, | |
"File contains an incomplete IR stream" | |
}; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not throw an exception here because we intend to handle missing End-of-Stream packet gracefully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@junhaoliao, understood. Thank you for the clarification.
✏️ Learnings added
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:85-91
Timestamp: 2024-11-04T19:37:26.993Z
Learning: In the `IrStreamReader::deserialize_stream` method, we handle missing End-of-Stream packets gracefully by logging an error and breaking the loop when encountering `std::errc::result_out_of_range`, rather than throwing an exception.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:58-59
Timestamp: 2024-11-04T19:35:32.602Z
Learning: In the `IrStreamReader::filter_log_events` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, the wrapper code is typed with TypeScript, ensuring that non-array inputs for `log_level_filter` would fail static checks, so additional runtime error handling is unnecessary.
src/clp_ffi_js/ir/IrStreamReader.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides renaming the class from StreamReader
to IrStreamReader
, did we make any other changes to this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed create()
since we have effectively moved this to base class. I removed encripten bindings and put them all in base class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
src/clp_ffi_js/ir/StreamReader.cpp (1)
5-18
: Reorganize includes following standard convention.The includes should be organized in the following groups, separated by blank lines:
- C++ standard library headers
- Third-party library headers
- Project headers
Apply this organization:
#include <cstddef> #include <cstdint> #include <format> #include <memory> #include <utility> #include <clp/Array.hpp> -#include <clp/ErrorCode.hpp> #include <clp/streaming_compression/zstd/Decompressor.hpp> #include <clp/TraceableException.hpp> +#include <clp/ErrorCode.hpp> #include <emscripten/bind.h> #include <spdlog/spdlog.h> #include <clp_ffi_js/ClpFfiJsException.hpp> #include <clp_ffi_js/ir/decoding_methods.hpp> #include <clp_ffi_js/ir/IrStreamReader.hpp>src/clp_ffi_js/ir/IrStreamReader.cpp (2)
102-119
: Consider extracting log level parsing logic.The log level extraction logic is complex and could be moved to a separate method for better readability and maintainability.
+auto parse_log_level(std::string_view logtype) -> LogLevel { + constexpr size_t cLogLevelPositionInMessages{1}; + if (logtype.length() <= cLogLevelPositionInMessages) { + return LogLevel::NONE; + } + + auto const log_level_name_it{std::find_if( + cLogLevelNames.begin() + static_cast<size_t>(cValidLogLevelsBeginIdx), + cLogLevelNames.end(), + [&](std::string_view level) { + return logtype.substr(cLogLevelPositionInMessages).starts_with(level); + } + )}; + + return (log_level_name_it != cLogLevelNames.end()) + ? static_cast<LogLevel>(std::distance(cLogLevelNames.begin(), log_level_name_it)) + : LogLevel::NONE; +} - constexpr size_t cLogLevelPositionInMessages{1}; - LogLevel log_level{LogLevel::NONE}; - if (logtype.length() > cLogLevelPositionInMessages) { - auto const log_level_name_it{std::find_if( - cLogLevelNames.begin() + static_cast<size_t>(cValidLogLevelsBeginIdx), - cLogLevelNames.end(), - [&](std::string_view level) { - return logtype.substr(cLogLevelPositionInMessages).starts_with(level); - } - )}; - if (log_level_name_it != cLogLevelNames.end()) { - log_level = static_cast<LogLevel>( - std::distance(cLogLevelNames.begin(), log_level_name_it) - ); - } - } + auto const log_level{parse_log_level(logtype)};
177-185
: Add documentation for the EM_ASM block.The EM_ASM block performs critical JavaScript array manipulation, but its purpose and structure could be better documented.
+ // EM_ASM block pushes a tuple [message, timestamp, log_level, index] to the JavaScript array + // Using direct EM_ASM for performance instead of higher-level emscripten::val operations EM_ASM( { Emval.toValue($0).push([UTF8ToString($1), $2, $3, $4]); }, results.as_handle(),
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
src/clp_ffi_js/ir/IrStreamReader.cpp
(1 hunks)src/clp_ffi_js/ir/IrStreamReader.hpp
(1 hunks)src/clp_ffi_js/ir/StreamReader.cpp
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/clp_ffi_js/ir/IrStreamReader.hpp
🧰 Additional context used
📓 Learnings (2)
src/clp_ffi_js/ir/IrStreamReader.cpp (3)
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:58-59
Timestamp: 2024-11-04T19:35:32.602Z
Learning: In the `IrStreamReader::filter_log_events` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, the wrapper code is typed with TypeScript, ensuring that non-array inputs for `log_level_filter` would fail static checks, so additional runtime error handling is unnecessary.
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:85-91
Timestamp: 2024-11-04T19:37:27.204Z
Learning: In the `IrStreamReader::deserialize_stream` method, we handle missing End-of-Stream packets gracefully by logging an error and breaking the loop when encountering `std::errc::result_out_of_range`, rather than throwing an exception.
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:162-168
Timestamp: 2024-11-04T19:55:27.398Z
Learning: In the `IrStreamReader::decode_range` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, message decoding failures are not expected unless the file is corrupted or the deserializer in CLP core does not handle edge cases, so it's appropriate to throw an exception upon message decoding failure.
src/clp_ffi_js/ir/StreamReader.cpp (1)
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/StreamReader.cpp:35-45
Timestamp: 2024-11-04T20:55:01.745Z
Learning: An IRv1 version validation utility function is not planned to be added to CLP-core soon.
🔇 Additional comments (7)
src/clp_ffi_js/ir/StreamReader.cpp (3)
35-45
: Implement version checking using constants.
Based on the previous review discussion and the learning that "An IRv1 version validation utility function is not planned to be added to CLP-core soon", we should at least make the current version checking more maintainable.
46-52
: Well-implemented error handling with proper logging and detailed exception.
The error handling follows best practices by:
- Using SPDLOG_CRITICAL for logging
- Throwing a custom exception with detailed information
- Using std::format for clear error messages
Line range hint 58-77
: Well-structured Emscripten bindings with proper memory management.
The bindings implementation:
- Correctly registers TypeScript types for interoperability
- Uses appropriate memory management with take_ownership
- Provides a clean interface for JavaScript/TypeScript consumers
src/clp_ffi_js/ir/IrStreamReader.cpp (4)
1-38
: LGTM! Well-organized includes and namespace setup.
The includes are properly organized into logical groups (standard library, external libraries, and project headers), and the using declarations are appropriately scoped within the namespace.
39-71
: LGTM! Efficient event filtering implementation.
The event management methods are well-implemented with:
- Proper handling of optional filtered map
- Efficient use of std::ranges for filtering
- Type-safe handling of log levels through TypeScript typing (as per previous learnings)
73-98
: LGTM! Robust error handling in stream deserialization.
The error handling follows the established pattern:
- Gracefully handles missing End-of-Stream packets
- Properly throws exceptions for corruption errors
- Clear error messages with detailed context
190-223
: LGTM! Robust constructor and data context creation.
The implementation demonstrates good practices:
- Proper use of move semantics
- RAII principles in constructor
- Thorough error handling with detailed messages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
src/clp_ffi_js/ir/IrStreamReader.cpp (3)
78-79
: Consider making the magic number configurable.The default number of reserved log events (500,000) could be made configurable through a constant or configuration parameter for better flexibility.
- constexpr size_t cDefaultNumReservedLogEvents{500'000}; + static constexpr size_t cDefaultNumReservedLogEvents{500'000}; // TODO: Make configurable
103-119
: Consider extracting log level parsing logic.The log level parsing logic is complex enough to warrant its own method for better readability and maintainability.
+ auto parse_log_level(std::string_view logtype) -> LogLevel { + constexpr size_t cLogLevelPositionInMessages{1}; + if (logtype.length() <= cLogLevelPositionInMessages) { + return LogLevel::NONE; + } + + auto const log_level_name_it{std::find_if( + cLogLevelNames.begin() + static_cast<size_t>(cValidLogLevelsBeginIdx), + cLogLevelNames.end(), + [&](std::string_view level) { + return logtype.substr(cLogLevelPositionInMessages).starts_with(level); + } + )}; + + if (log_level_name_it == cLogLevelNames.end()) { + return LogLevel::NONE; + } + + return static_cast<LogLevel>( + std::distance(cLogLevelNames.begin(), log_level_name_it) + ); + }
177-185
: Consider using Embind instead of EM_ASM for better type safety.The current implementation uses EM_ASM for array manipulation. Consider using Embind's val::array() methods for better type safety and maintainability.
- EM_ASM( - { Emval.toValue($0).push([UTF8ToString($1), $2, $3, $4]); }, - results.as_handle(), - message.c_str(), - log_event.get_timestamp(), - log_event.get_log_level(), - log_event_idx + 1 - ); + results.call<void>("push", emscripten::val::array( + std::vector<emscripten::val>{ + emscripten::val(message), + emscripten::val(log_event.get_timestamp()), + emscripten::val(log_event.get_log_level()), + emscripten::val(log_event_idx + 1) + }));
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
src/clp_ffi_js/ir/IrStreamReader.cpp
(1 hunks)src/submodules/clp
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/submodules/clp
🧰 Additional context used
📓 Learnings (1)
src/clp_ffi_js/ir/IrStreamReader.cpp (3)
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:58-59
Timestamp: 2024-11-04T19:35:32.602Z
Learning: In the `IrStreamReader::filter_log_events` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, the wrapper code is typed with TypeScript, ensuring that non-array inputs for `log_level_filter` would fail static checks, so additional runtime error handling is unnecessary.
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:85-91
Timestamp: 2024-11-04T19:37:27.204Z
Learning: In the `IrStreamReader::deserialize_stream` method, we handle missing End-of-Stream packets gracefully by logging an error and breaking the loop when encountering `std::errc::result_out_of_range`, rather than throwing an exception.
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:162-168
Timestamp: 2024-11-04T19:55:27.398Z
Learning: In the `IrStreamReader::decode_range` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, message decoding failures are not expected unless the file is corrupted or the deserializer in CLP core does not handle edge cases, so it's appropriate to throw an exception upon message decoding failure.
🔇 Additional comments (4)
src/clp_ffi_js/ir/IrStreamReader.cpp (4)
1-38
: LGTM! Well-organized includes and namespace setup.
The includes are logically grouped and the namespace usage is appropriate.
39-41
: LGTM! Simple and effective accessor method.
190-197
: LGTM! Constructor properly initializes member variables.
The constructor correctly moves the data context and initializes the timestamp pattern.
199-223
: LGTM! Robust error handling in data context creation.
The method properly validates the decompressor and handles errors appropriately.
src/clp_ffi_js/ir/StreamReader.cpp
Outdated
@@ -3,249 +3,122 @@ | |||
#include <algorithm> | |||
#include <cstddef> | |||
#include <cstdint> | |||
#include <iterator> | |||
#include <format> | |||
#include <json/single_include/nlohmann/json.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to add json
to src/clp-ffi-js/.clang-format
.
src/clp_ffi_js/ir/StreamReader.cpp
Outdated
// Required to validate encoding type prior to getting version. | ||
rewind_reader_and_validate_encoding_type(*zstd_decompressor); | ||
auto const version{get_version(*zstd_decompressor)}; | ||
|
||
// Required that reader offset matches position after validation in order to decode log events. | ||
rewind_reader_and_validate_encoding_type(*zstd_decompressor); | ||
if (std::ranges::find(cUnstructuredIrVersions, version) != cUnstructuredIrVersions.end()) { | ||
return std::make_unique<UnstructuredIrStreamReader>(UnstructuredIrStreamReader::create( | ||
std::move(zstd_decompressor), | ||
std::move(data_buffer) | ||
)); | ||
} | ||
SPDLOG_INFO("did i get here 3"); | ||
|
||
throw ClpFfiJsException{ | ||
clp::ErrorCode::ErrorCode_Unsupported, | ||
__FILENAME__, | ||
__LINE__, | ||
std::format("Unable to create reader for IR stream with version {}.", version) | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Required to validate encoding type prior to getting version. | |
rewind_reader_and_validate_encoding_type(*zstd_decompressor); | |
auto const version{get_version(*zstd_decompressor)}; | |
// Required that reader offset matches position after validation in order to decode log events. | |
rewind_reader_and_validate_encoding_type(*zstd_decompressor); | |
if (std::ranges::find(cUnstructuredIrVersions, version) != cUnstructuredIrVersions.end()) { | |
return std::make_unique<UnstructuredIrStreamReader>(UnstructuredIrStreamReader::create( | |
std::move(zstd_decompressor), | |
std::move(data_buffer) | |
)); | |
} | |
SPDLOG_INFO("did i get here 3"); | |
throw ClpFfiJsException{ | |
clp::ErrorCode::ErrorCode_Unsupported, | |
__FILENAME__, | |
__LINE__, | |
std::format("Unable to create reader for IR stream with version {}.", version) | |
}; | |
rewind_reader_and_validate_encoding_type(*zstd_decompressor); | |
// Validate the stream's version | |
auto pos = zstd_decompressor->get_pos(); | |
auto const version{get_version(*zstd_decompressor)}; | |
if (std::ranges::find(cUnstructuredIrVersions, version) == cUnstructuredIrVersions.end()) { | |
throw ClpFfiJsException{ | |
clp::ErrorCode::ErrorCode_Unsupported, | |
__FILENAME__, | |
__LINE__, | |
std::format("Unable to create reader for IR stream with version {}.", version) | |
}; | |
} | |
try { | |
zstd_decompressor->seek_from_begin(pos); | |
} catch (ZstdDecompressor::OperationFailed& e) { | |
throw ClpFfiJsException{ | |
clp::ErrorCode::ErrorCode_Failure, | |
__FILENAME__, | |
__LINE__, | |
std::format("Unable to rewind zstd decompressor: {}", e.what()) | |
}; | |
} | |
return std::make_unique<UnstructuredIrStreamReader>( | |
UnstructuredIrStreamReader::create(std::move(zstd_decompressor), std::move(data_buffer)) | |
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does a few things:
- Refactors the code structure a bit in terms of comments and flow
- Note that we handle the error path in the if block rather than the previous code which handled the success path in the if block
- Updated the flow to avoid re-validating the encoding type twice:
- Rewind + validate encoding type
- Save position
- Validate version
- Restore position
- Create and return the reader
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better for you to copy and commit this manually rather than applying the change from GH, just so that you can double-check what I did.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The question is did you test it... I tried something similiar but it did not work. I will test it for you and let u know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, hence why I wanted you to double-check, lol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay it seems to work now. I dont know why mine was giving an error, but wtv i guess.
*/ | ||
[[nodiscard]] static auto create( | ||
std::unique_ptr<ZstdDecompressor>&& zstd_decompressor, | ||
clp::Array<char>&& data_array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's pass this as a value for the same reason I discussed with Junhao in a previous PR. Essentially, passing it as a value and moving it in the caller offers maximum flexibility. We can't do the same for zstd_decompressor because the class doesn't support it and likely won't any time soon.
src/clp_ffi_js/ir/IrStreamReader.hpp
Outdated
void filter_log_events(LogLevelFilterTsType const& log_level_filter) override; | ||
|
||
/** | ||
* Deserializes all log events in the stream. After the stream has been exhausted, it will be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, I see you've deleted this docstring. I'm fine keeping it this way, but if @junhaoliao wants to keep the part about deallocating the stream after exhaustion, then we need to add the docstring back in UnstructuredIrStreamReader
.
Co-authored-by: kirkrodrigues <[email protected]>
Co-authored-by: kirkrodrigues <[email protected]>
Co-authored-by: kirkrodrigues <[email protected]>
Co-authored-by: kirkrodrigues <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (1)
src/clp_ffi_js/ir/StreamReader.cpp (1)
168-168
: Remove debug log message.The debug message "did i get here 3" was left in accidentally and should be removed.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
src/clp_ffi_js/ir/StreamReader.cpp
(2 hunks)src/clp_ffi_js/ir/StreamReader.hpp
(3 hunks)src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/clp_ffi_js/ir/StreamReader.hpp
- src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp
🧰 Additional context used
📓 Learnings (1)
src/clp_ffi_js/ir/StreamReader.cpp (3)
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/StreamReader.cpp:35-45
Timestamp: 2024-11-04T20:55:01.745Z
Learning: An IRv1 version validation utility function is not planned to be added to CLP-core soon.
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:162-168
Timestamp: 2024-11-04T19:55:27.398Z
Learning: In the `IrStreamReader::decode_range` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, message decoding failures are not expected unless the file is corrupted or the deserializer in CLP core does not handle edge cases, so it's appropriate to throw an exception upon message decoding failure.
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:85-91
Timestamp: 2024-11-04T19:37:27.204Z
Learning: In the `IrStreamReader::deserialize_stream` method, we handle missing End-of-Stream packets gracefully by logging an error and breaking the loop when encountering `std::errc::result_out_of_range`, rather than throwing an exception.
🪛 cppcheck
src/clp_ffi_js/ir/StreamReader.cpp
[error] 111-111: syntax error
(syntaxError)
🔇 Additional comments (1)
src/clp_ffi_js/ir/StreamReader.cpp (1)
7-7
: Add json to .clang-format configuration.
As noted in previous reviews, the json include needs to be added to src/clp_ffi-js/.clang-format
to ensure consistent formatting.
src/clp_ffi_js/ir/StreamReader.cpp
Outdated
// Required to validate encoding type prior to getting version. | ||
rewind_reader_and_validate_encoding_type(*zstd_decompressor); | ||
auto const version{get_version(*zstd_decompressor)}; | ||
|
||
// Required that reader offset matches position after validation in order to decode log events. | ||
rewind_reader_and_validate_encoding_type(*zstd_decompressor); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Optimize stream validation flow.
The encoding type is validated twice unnecessarily. As suggested in previous reviews, we can optimize this by:
- Rewind + validate encoding type once
- Save position
- Validate version
- Restore position
- Create and return the reader
Apply this restructuring:
- // Required to validate encoding type prior to getting version.
- rewind_reader_and_validate_encoding_type(*zstd_decompressor);
+ rewind_reader_and_validate_encoding_type(*zstd_decompressor);
+
+ // Save position after encoding validation
+ auto pos = zstd_decompressor->get_pos();
+
+ // Get and validate version
auto const version{get_version(*zstd_decompressor)};
-
- // Required that reader offset matches position after validation in order to decode log events.
- rewind_reader_and_validate_encoding_type(*zstd_decompressor);
+
+ // Restore position for decoding
+ try {
+ zstd_decompressor->seek_from_begin(pos);
+ } catch (ZstdDecompressor::OperationFailed& e) {
+ throw ClpFfiJsException{
+ clp::ErrorCode::ErrorCode_Failure,
+ __FILENAME__,
+ __LINE__,
+ std::format("Unable to rewind zstd decompressor: {}", e.what())
+ };
+ }
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Required to validate encoding type prior to getting version. | |
rewind_reader_and_validate_encoding_type(*zstd_decompressor); | |
auto const version{get_version(*zstd_decompressor)}; | |
// Required that reader offset matches position after validation in order to decode log events. | |
rewind_reader_and_validate_encoding_type(*zstd_decompressor); | |
rewind_reader_and_validate_encoding_type(*zstd_decompressor); | |
// Save position after encoding validation | |
auto pos = zstd_decompressor->get_pos(); | |
// Get and validate version | |
auto const version{get_version(*zstd_decompressor)}; | |
// Restore position for decoding | |
try { | |
zstd_decompressor->seek_from_begin(pos); | |
} catch (ZstdDecompressor::OperationFailed& e) { | |
throw ClpFfiJsException{ | |
clp::ErrorCode::ErrorCode_Failure, | |
__FILENAME__, | |
__LINE__, | |
std::format("Unable to rewind zstd decompressor: {}", e.what()) | |
}; | |
} |
auto get_version(clp::ReaderInterface& reader) -> std::string { | ||
// Deserialize metadata bytes from preamble. | ||
clp::ffi::ir_stream::encoded_tag_t metadata_type{}; | ||
std::vector<int8_t> metadata_bytes; | ||
auto const err{clp::ffi::ir_stream::deserialize_preamble(reader, metadata_type, metadata_bytes) | ||
}; | ||
if (result.has_error()) { | ||
auto const error_code{result.error()}; | ||
SPDLOG_CRITICAL( | ||
"Failed to create deserializer: {}:{}", | ||
error_code.category().name(), | ||
error_code.message() | ||
); | ||
if (IRErrorCode::IRErrorCode_Success != err) { | ||
throw ClpFfiJsException{ | ||
clp::ErrorCode::ErrorCode_Failure, | ||
__FILENAME__, | ||
__LINE__, | ||
"Failed to create deserializer" | ||
std::format( | ||
"Failed to deserialize preamble: IR error code {}", | ||
clp::enum_to_underlying_type(err) | ||
) | ||
}; | ||
} | ||
|
||
return {std::move(data_buffer), std::move(zstd_decompressor), std::move(result.value())}; | ||
std::string version; | ||
try { | ||
// Deserialize metadata bytes as JSON. | ||
std::string_view const metadata_view{ | ||
clp::size_checked_pointer_cast<char const>(metadata_bytes.data()), | ||
metadata_bytes.size() | ||
}; | ||
nlohmann::json const metadata = nlohmann::json::parse(metadata_view); | ||
version = metadata.at(clp::ffi::ir_stream::cProtocol::Metadata::VersionKey); | ||
} catch (nlohmann::json::exception const& e) { | ||
throw ClpFfiJsException{ | ||
clp::ErrorCode::ErrorCode_MetadataCorrupted, | ||
__FILENAME__, | ||
__LINE__, | ||
std::format("Failed to parse stream's metadata: {}", e.what()) | ||
}; | ||
} | ||
|
||
SPDLOG_INFO("IR version is {}", version); | ||
return version; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add metadata type validation.
The function deserializes the preamble but doesn't validate if the metadata_type is correct before attempting to parse it as JSON.
Add validation like this:
if (IRErrorCode::IRErrorCode_Success != err) {
throw ClpFfiJsException{
clp::ErrorCode::ErrorCode_Failure,
__FILENAME__,
__LINE__,
std::format(
"Failed to deserialize preamble: IR error code {}",
clp::enum_to_underlying_type(err)
)
};
}
+ if (metadata_type != clp::ffi::ir_stream::cProtocol::Tags::Metadata) {
+ throw ClpFfiJsException{
+ clp::ErrorCode::ErrorCode_MetadataCorrupted,
+ __FILENAME__,
+ __LINE__,
+ std::format("Invalid metadata type: {}", metadata_type)
+ };
+ }
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
auto get_version(clp::ReaderInterface& reader) -> std::string { | |
// Deserialize metadata bytes from preamble. | |
clp::ffi::ir_stream::encoded_tag_t metadata_type{}; | |
std::vector<int8_t> metadata_bytes; | |
auto const err{clp::ffi::ir_stream::deserialize_preamble(reader, metadata_type, metadata_bytes) | |
}; | |
if (result.has_error()) { | |
auto const error_code{result.error()}; | |
SPDLOG_CRITICAL( | |
"Failed to create deserializer: {}:{}", | |
error_code.category().name(), | |
error_code.message() | |
); | |
if (IRErrorCode::IRErrorCode_Success != err) { | |
throw ClpFfiJsException{ | |
clp::ErrorCode::ErrorCode_Failure, | |
__FILENAME__, | |
__LINE__, | |
"Failed to create deserializer" | |
std::format( | |
"Failed to deserialize preamble: IR error code {}", | |
clp::enum_to_underlying_type(err) | |
) | |
}; | |
} | |
return {std::move(data_buffer), std::move(zstd_decompressor), std::move(result.value())}; | |
std::string version; | |
try { | |
// Deserialize metadata bytes as JSON. | |
std::string_view const metadata_view{ | |
clp::size_checked_pointer_cast<char const>(metadata_bytes.data()), | |
metadata_bytes.size() | |
}; | |
nlohmann::json const metadata = nlohmann::json::parse(metadata_view); | |
version = metadata.at(clp::ffi::ir_stream::cProtocol::Metadata::VersionKey); | |
} catch (nlohmann::json::exception const& e) { | |
throw ClpFfiJsException{ | |
clp::ErrorCode::ErrorCode_MetadataCorrupted, | |
__FILENAME__, | |
__LINE__, | |
std::format("Failed to parse stream's metadata: {}", e.what()) | |
}; | |
} | |
SPDLOG_INFO("IR version is {}", version); | |
return version; | |
auto get_version(clp::ReaderInterface& reader) -> std::string { | |
// Deserialize metadata bytes from preamble. | |
clp::ffi::ir_stream::encoded_tag_t metadata_type{}; | |
std::vector<int8_t> metadata_bytes; | |
auto const err{clp::ffi::ir_stream::deserialize_preamble(reader, metadata_type, metadata_bytes) | |
}; | |
if (IRErrorCode::IRErrorCode_Success != err) { | |
throw ClpFfiJsException{ | |
clp::ErrorCode::ErrorCode_Failure, | |
__FILENAME__, | |
__LINE__, | |
std::format( | |
"Failed to deserialize preamble: IR error code {}", | |
clp::enum_to_underlying_type(err) | |
) | |
}; | |
} | |
if (metadata_type != clp::ffi::ir_stream::cProtocol::Tags::Metadata) { | |
throw ClpFfiJsException{ | |
clp::ErrorCode::ErrorCode_MetadataCorrupted, | |
__FILENAME__, | |
__LINE__, | |
std::format("Invalid metadata type: {}", metadata_type) | |
}; | |
} | |
std::string version; | |
try { | |
// Deserialize metadata bytes as JSON. | |
std::string_view const metadata_view{ | |
clp::size_checked_pointer_cast<char const>(metadata_bytes.data()), | |
metadata_bytes.size() | |
}; | |
nlohmann::json const metadata = nlohmann::json::parse(metadata_view); | |
version = metadata.at(clp::ffi::ir_stream::cProtocol::Metadata::VersionKey); | |
} catch (nlohmann::json::exception const& e) { | |
throw ClpFfiJsException{ | |
clp::ErrorCode::ErrorCode_MetadataCorrupted, | |
__FILENAME__, | |
__LINE__, | |
std::format("Failed to parse stream's metadata: {}", e.what()) | |
}; | |
} | |
SPDLOG_INFO("IR version is {}", version); | |
return version; |
Co-authored-by: kirkrodrigues <[email protected]>
Co-authored-by: kirkrodrigues <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
src/clp_ffi_js/ir/StreamReader.cpp (2)
6-26
: Reorganize header includes into logical groups.Based on previous discussions, the headers should be organized into distinct groups:
- C++ standard library headers
- Third-party library headers
- Project-specific headers (clp)
- Local headers
Apply this reorganization:
-#include <format> -#include <memory> -#include <string> -#include <string_view> -#include <utility> -#include <vector> - -#include <clp/Array.hpp> -#include <clp/ErrorCode.hpp> -#include <clp/ffi/ir_stream/decoding_methods.hpp> -#include <clp/ffi/ir_stream/protocol_constants.hpp> -#include <clp/ReaderInterface.hpp> -#include <clp/streaming_compression/zstd/Decompressor.hpp> -#include <clp/TraceableException.hpp> -#include <clp/type_utils.hpp> -#include <emscripten/bind.h> -#include <json/single_include/nlohmann/json.hpp> -#include <spdlog/spdlog.h> - -#include <clp_ffi_js/ClpFfiJsException.hpp> -#include <clp_ffi_js/ir/UnstructuredIrStreamReader.hpp> +// C++ standard library +#include <format> +#include <memory> +#include <string> +#include <string_view> +#include <utility> +#include <vector> + +// Third-party libraries +#include <emscripten/bind.h> +#include <json/single_include/nlohmann/json.hpp> +#include <spdlog/spdlog.h> + +// CLP libraries +#include <clp/Array.hpp> +#include <clp/ErrorCode.hpp> +#include <clp/ffi/ir_stream/decoding_methods.hpp> +#include <clp/ffi/ir_stream/protocol_constants.hpp> +#include <clp/ReaderInterface.hpp> +#include <clp/streaming_compression/zstd/Decompressor.hpp> +#include <clp/TraceableException.hpp> +#include <clp/type_utils.hpp> + +// Local headers +#include <clp_ffi_js/ClpFfiJsException.hpp> +#include <clp_ffi_js/ir/UnstructuredIrStreamReader.hpp>
116-121
: Add newlines before comments for better readability.Following the established coding style, add newlines before comments.
Apply this formatting:
EMSCRIPTEN_BINDINGS(ClpStreamReader) { + // JS types used as inputs emscripten::register_type<clp_ffi_js::ir::DataArrayTsType>("Uint8Array"); emscripten::register_type<clp_ffi_js::ir::LogLevelFilterTsType>("number[] | null"); + // JS types used as outputs🧰 Tools
🪛 cppcheck
[error] 116-116: syntax error
(syntaxError)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (5)
CMakeLists.txt
(1 hunks)src/clp_ffi_js/.clang-format
(1 hunks)src/clp_ffi_js/ir/StreamReader.cpp
(2 hunks)src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp
(1 hunks)src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/clp_ffi_js/.clang-format
🚧 Files skipped from review as they are similar to previous changes (3)
- CMakeLists.txt
- src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp
- src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp
🧰 Additional context used
📓 Learnings (1)
src/clp_ffi_js/ir/StreamReader.cpp (2)
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/StreamReader.cpp:35-45
Timestamp: 2024-11-04T20:55:01.745Z
Learning: An IRv1 version validation utility function is not planned to be added to CLP-core soon.
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:162-168
Timestamp: 2024-11-04T19:55:27.398Z
Learning: In the `IrStreamReader::decode_range` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, message decoding failures are not expected unless the file is corrupted or the deserializer in CLP core does not handle edge cases, so it's appropriate to throw an exception upon message decoding failure.
🪛 cppcheck
src/clp_ffi_js/ir/StreamReader.cpp
[error] 116-116: syntax error
(syntaxError)
🔇 Additional comments (3)
src/clp_ffi_js/ir/StreamReader.cpp (3)
32-48
: LGTM! Well-documented function declarations.
The function declarations are clear and include comprehensive documentation with proper exception specifications.
49-72
: LGTM! Robust encoding type validation.
The function properly handles:
- Reader position reset
- Encoding type validation
- Appropriate error handling with descriptive messages
146-187
: LGTM! Well-structured stream reader creation.
The implementation follows best practices with:
- Proper memory management using smart pointers
- Optimized stream validation flow
- Comprehensive error handling
- Clear logging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the PR title, how about:
Split `StreamReader` into an interface and implementation to prepare for adding another IR stream reader.
StreamReader
into an interface and implementation to prepare for adding another IR stream reader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp (1)
63-70
: Enhance method documentation.The documentation for
deserialize_stream()
only references the base class method without explaining its specific behaviour in this context. Consider adding details about:
- What exactly is being deserialized
- What happens to the stream after exhaustion
- Any specific error conditions
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp
(1 hunks)src/submodules/clp
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/submodules/clp
🔇 Additional comments (5)
src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp (5)
21-25
: LGTM! Well-documented type alias.
The FilteredLogEventsMap type alias is clearly documented and effectively describes its purpose as a mapping between filtered and unfiltered log event indices.
27-31
: LGTM! Clear and concise class documentation.
The class documentation effectively communicates its purpose and responsibilities.
33-43
: LGTM! Proper RAII implementation.
The class correctly implements RAII with well-defined destructor and move/copy semantics. The rationale for deleting the move assignment operator is clearly documented.
76-87
: LGTM! Well-structured private implementation.
The private section demonstrates good design with:
- Proper encapsulation using private constructor
- Clear member variable organization
- Appropriate use of smart pointers for ownership
31-87
: Verify error handling across FFI boundary.
Since this class handles compressed data and operates across the FFI boundary, let's verify the error handling implementation:
✅ Verification successful
Let me gather more information about memory safety patterns, particularly around unique pointers and RAII usage.
Error handling and memory safety are properly implemented
The verification shows robust error handling and memory safety mechanisms:
- Proper RAII with smart pointers (
unique_ptr
) for managing resources - Consistent error handling using
ClpFfiJsException
with detailed error messages - Clear ownership semantics with move operations and deleted copy operations
- Comprehensive error checks for:
- Corrupt metadata
- Deserialization failures
- Encoding type validation
- Stream version compatibility
- Decompression errors
The implementation follows best practices for C++ FFI boundary safety through exception handling and RAII patterns.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for error handling patterns in the implementation
echo "Searching for error handling in implementation..."
rg -A 5 "throw|error|exception" --type cpp
# Look for memory safety patterns
echo "Checking memory safety patterns..."
ast-grep --pattern 'try { $$$ } catch { $$$ }'
Length of output: 10412
Script:
#!/bin/bash
# Check for unique_ptr usage and ownership transfer
echo "Checking smart pointer patterns..."
rg "unique_ptr|shared_ptr|make_unique" --type cpp -A 3
# Look for move semantics and RAII patterns
echo "Checking RAII and move semantics..."
ast-grep --pattern 'std::move($_)'
Length of output: 3823
Description
Extracts a base class for IRv1 and IRv2 stream readers to inherit from. Since IRv2 is not yet supported by clp-ffi-js, the repo now includes a base class and a derived class for IRv1.
Code was originally written by Junhao. He made these changes
I merged main afterwards, and made small changes like exporting base class methods instead of derived class.
Validation performed
Build and test functionality in log viewer (opening a few files and filtering). FYI since the exported class name changed, we will need to update log viewer when this is re-released.
Summary by CodeRabbit
New Features
IrStreamReader
class for reading and processing IR streams.Bug Fixes
Refactor
StreamReader
class to improve memory management and extensibility.Documentation
Summary by CodeRabbit
New Features
UnstructuredIrStreamReader
class for handling unstructured IR streams, including methods for filtering log events and deserializing data.Refactor
StreamReader
class, focusing on smart pointer management and interface simplification.Chores