Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for deserializing and decoding v0.1.0 IR streams, but without log-level parsing and filtering. #30

Merged
merged 43 commits into from
Nov 9, 2024
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
Show all changes
43 commits
Select commit Hold shift + click to select a range
fc08027
Add StructuredIrStreamReader.
junhaoliao Nov 7, 2024
8d20583
Reformat; Add missing files to VCS.
junhaoliao Nov 7, 2024
4757bd4
Fix clp submodule commit id.
junhaoliao Nov 7, 2024
12eeca6
Change log level return type int64_t -> uint8_t.
junhaoliao Nov 7, 2024
d870b19
update clp commit id.
junhaoliao Nov 7, 2024
edda4b3
Use version checking utility from CLP FFI.
junhaoliao Nov 7, 2024
8a1e9a6
Add docstrings for class IrUnitHandler.
junhaoliao Nov 7, 2024
5f005e7
Correct ReaderOptions TS type.
junhaoliao Nov 7, 2024
9581578
Swap timestamp & log_level return order.
junhaoliao Nov 7, 2024
dc652ad
Return IR version via `get_ir_protocol_error_code`.
junhaoliao Nov 7, 2024
7d1fa0e
Fix lint.
junhaoliao Nov 7, 2024
c675c61
Do not mark schema_tree_node_locator [[maybe_unused]] - Apply suggest…
junhaoliao Nov 7, 2024
5c23182
Remove unused using string_literals - Apply suggestions from code review
junhaoliao Nov 7, 2024
a4c2c07
Use braces for declarations - Apply suggestions from code review
junhaoliao Nov 7, 2024
b940296
Use braces for declarations - Apply suggestions from code review
junhaoliao Nov 7, 2024
1087f94
Error messages - Apply suggestions from code review
junhaoliao Nov 7, 2024
5e8d42a
Remove dead code - Apply suggestions from code review
junhaoliao Nov 7, 2024
570b8eb
Add clp/ prefix in includes - Apply suggestions from code review
junhaoliao Nov 7, 2024
59b163b
Docs - Apply suggestions from code review
junhaoliao Nov 7, 2024
92e44b3
Add docs about potential shared_ptr replacement with gsl::not_null - …
junhaoliao Nov 7, 2024
d170434
Rename get_ir_protocol_error_code -> get_ir_stream_type; return IrStr…
junhaoliao Nov 7, 2024
b110479
Move version_validation_result into try {}; update docs.
junhaoliao Nov 7, 2024
181206e
Remove redundant docs for get_deserializer and get_reader.
junhaoliao Nov 7, 2024
cac21ee
Rename cLogLevelFilteringNotSupportedPrompt -> cLogLevelFilteringNotS…
junhaoliao Nov 7, 2024
b6d837c
Add const for cReaderOptionsLogLevelKey and cReaderOptionsTimestampKey.
junhaoliao Nov 7, 2024
11b347b
Use const instead of magic strings.
junhaoliao Nov 7, 2024
6588d56
Reformat code.
junhaoliao Nov 7, 2024
2bca1cc
Rename `json` -> `json_result`.
junhaoliao Nov 7, 2024
94f31af
Reorder member variable initializations to match their declaration or…
junhaoliao Nov 7, 2024
072c7bc
Rename parsed_tree_node_id_t -> schema_tree_node_id_t.
junhaoliao Nov 7, 2024
08e0bdd
Rename level_node_id -> log_level_node_id.
junhaoliao Nov 7, 2024
95b8102
Move StructuredIrStreamReader::create declaration to the top of the c…
junhaoliao Nov 7, 2024
37e6f30
Remove all log level handling in StructuredIrStreamReader.
junhaoliao Nov 7, 2024
058ec78
Type check timestamp value before calling get_immutable_view.
junhaoliao Nov 7, 2024
4d2c076
Put empty json string when unable to decode.
junhaoliao Nov 8, 2024
49a72c3
Lint.
junhaoliao Nov 8, 2024
12dfb9e
Rename `IrStreamType` -> `StreamType` in the `ir` namespace.
junhaoliao Nov 8, 2024
2df4df8
Use anonymous namespace for internal linkages for const strings.
junhaoliao Nov 8, 2024
7edb19e
Improve error message when serialize_to_json fails - Apply suggestion…
junhaoliao Nov 8, 2024
77e2b2e
Fix sentence case in docs - Apply suggestions from code review
junhaoliao Nov 8, 2024
320f1ea
Update clp commit id which has https://github.com/y-scope/clp/pull/57…
junhaoliao Nov 8, 2024
81135fa
lint.
junhaoliao Nov 8, 2024
cca920a
Nest anonymous namespace into clp_ffi_js::ir to reduce scope of const…
junhaoliao Nov 8, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -83,11 +83,16 @@ endif()

set(CLP_FFI_JS_SRC_MAIN
src/clp_ffi_js/ir/StreamReader.cpp
src/clp_ffi_js/ir/StructuredIrStreamReader.cpp
src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp
)

set(CLP_FFI_JS_SRC_CLP_CORE
src/submodules/clp/components/core/src/clp/ffi/ir_stream/decoding_methods.cpp
src/submodules/clp/components/core/src/clp/ffi/ir_stream/ir_unit_deserialization_methods.cpp
src/submodules/clp/components/core/src/clp/ffi/ir_stream/utils.cpp
src/submodules/clp/components/core/src/clp/ffi/KeyValuePairLogEvent.cpp
src/submodules/clp/components/core/src/clp/ffi/SchemaTree.cpp
src/submodules/clp/components/core/src/clp/ir/EncodedTextAst.cpp
src/submodules/clp/components/core/src/clp/ir/LogEventDeserializer.cpp
src/submodules/clp/components/core/src/clp/ReadOnlyMemoryMappedFile.cpp
Expand Down
57 changes: 42 additions & 15 deletions src/clp_ffi_js/ir/StreamReader.cpp
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
#include "StreamReader.hpp"

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <format>
Expand All @@ -23,6 +22,7 @@
#include <spdlog/spdlog.h>

#include <clp_ffi_js/ClpFfiJsException.hpp>
#include <clp_ffi_js/ir/StructuredIrStreamReader.hpp>
#include <clp_ffi_js/ir/UnstructuredIrStreamReader.hpp>

namespace {
Expand Down Expand Up @@ -117,8 +117,17 @@ EMSCRIPTEN_BINDINGS(ClpStreamReader) {
// JS types used as inputs
emscripten::register_type<clp_ffi_js::ir::DataArrayTsType>("Uint8Array");
emscripten::register_type<clp_ffi_js::ir::LogLevelFilterTsType>("number[] | null");
emscripten::register_type<clp_ffi_js::ir::ReaderOptions>(
"{logLevelKey: string, timestampKey: string} | null"
);

// JS types used as outputs
emscripten::enum_<clp::ffi::ir_stream::IRProtocolErrorCode>("IRProtocolErrorCode")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we end up exposing IRProtocolErrorCode, let's name it IrProtocolErrorCode. I think it's better to avoid propagating the naming errors from CLP.

.value("SUPPORTED", clp::ffi::ir_stream::IRProtocolErrorCode::Supported)
.value("BACKWARD_COMPATIBLE",
clp::ffi::ir_stream::IRProtocolErrorCode::BackwardCompatible)
.value("UNSUPPORTED", clp::ffi::ir_stream::IRProtocolErrorCode::Unsupported)
.value("INVALID", clp::ffi::ir_stream::IRProtocolErrorCode::Invalid);
emscripten::register_type<clp_ffi_js::ir::DecodedResultsTsType>(
"Array<[string, number, number, number]>"
);
Expand All @@ -128,6 +137,10 @@ EMSCRIPTEN_BINDINGS(ClpStreamReader) {
&clp_ffi_js::ir::StreamReader::create,
emscripten::return_value_policy::take_ownership()
)
.function(
"getIrProtocolErrorCode",
&clp_ffi_js::ir::StreamReader::get_ir_protocol_error_code
)
.function(
"getNumEventsBuffered",
&clp_ffi_js::ir::StreamReader::get_num_events_buffered
Expand All @@ -143,7 +156,8 @@ EMSCRIPTEN_BINDINGS(ClpStreamReader) {
} // namespace

namespace clp_ffi_js::ir {
auto StreamReader::create(DataArrayTsType const& data_array) -> std::unique_ptr<StreamReader> {
auto StreamReader::create(DataArrayTsType const& data_array, ReaderOptions const& reader_options)
-> std::unique_ptr<StreamReader> {
Comment on lines +151 to +152
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Potential API Breaking Change in StreamReader::create Signature

The create method of StreamReader now requires an additional parameter, ReaderOptions. This change may break existing code that calls StreamReader::create without the new parameter. Consider providing a default value or an overloaded method to maintain backward compatibility.

Apply this diff to add a default parameter value:

-auto StreamReader::create(DataArrayTsType const& data_array, ReaderOptions const& reader_options)
+auto StreamReader::create(DataArrayTsType const& data_array, ReaderOptions const& reader_options = {})
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
auto StreamReader::create(DataArrayTsType const& data_array, ReaderOptions const& reader_options)
-> std::unique_ptr<StreamReader> {
auto StreamReader::create(DataArrayTsType const& data_array, ReaderOptions const& reader_options = {})
-> std::unique_ptr<StreamReader> {

auto const length{data_array["length"].as<size_t>()};
SPDLOG_INFO("StreamReader::create: got buffer of length={}", length);

Expand All @@ -162,17 +176,27 @@ auto StreamReader::create(DataArrayTsType const& data_array) -> std::unique_ptr<
// Validate the stream's version
auto pos = zstd_decompressor->get_pos();
auto const version{get_version(*zstd_decompressor)};
if (std::ranges::find(cUnstructuredIrVersions, version) == cUnstructuredIrVersions.end()) {
throw ClpFfiJsException{
clp::ErrorCode::ErrorCode_Unsupported,
__FILENAME__,
__LINE__,
std::format("Unable to create reader for IR stream with version {}.", version)
};
}
auto const version_validation_result{clp::ffi::ir_stream::validate_protocol_version(version)};
Copy link

@coderabbitai coderabbitai bot Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Function validate_protocol_version appears to be missing from the codebase

The function clp::ffi::ir_stream::validate_protocol_version is being called in StreamReader.cpp, but after thorough searching through the codebase:

  • No function definition was found
  • No header file contains the declaration
  • Only usage appears in StreamReader.cpp

This confirms the original concern that this function is not yet available in the CLP-core library.

🔗 Analysis chain

Verify the existence of validate_protocol_version function

Please confirm that the function clp::ffi::ir_stream::validate_protocol_version is available in the CLP-core library. According to previous learnings, this utility function was not planned to be added soon.

Run the following script to check for the definition of validate_protocol_version:


Let me search for more specific information about this function's definition.


Let me try one more search with different parameters to find any related header files or function definitions.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for the definition of `validate_protocol_version` in the codebase.

rg 'validate_protocol_version'

Length of output: 164


Script:

#!/bin/bash
# Search for the function definition using ast-grep
ast-grep --pattern 'validate_protocol_version'

# Also check header files that might contain the declaration
rg -t hpp -t h 'validate_protocol_version'

# Look for any namespace definitions related to ir_stream
rg 'namespace.*ir_stream'

Length of output: 283


Script:

#!/bin/bash
# Search for header files with .h extension
fd -e h -x rg -l "validate_protocol_version" {} \;

# Search for any ir_stream related files
fd -t f -x rg -l "namespace.*ir_stream|ir_stream" {} \;

# Look for the function definition in cpp files
fd -e cpp -x rg -B 2 -A 5 "validate_protocol_version" {} \;

Length of output: 784

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is imported from a CLP upstream library.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@junhaoliao, thank you for the clarification. I understand that validate_protocol_version is imported from the CLP upstream library.


✏️ Learnings added
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#30
File: src/clp_ffi_js/ir/StreamReader.cpp:179-179
Timestamp: 2024-11-07T08:38:39.605Z
Learning: The function `clp::ffi::ir_stream::validate_protocol_version` used in `src/clp_ffi_js/ir/StreamReader.cpp` is imported from the CLP upstream library and may not be present in the local codebase.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can move this into the try to reduce its scope.


junhaoliao marked this conversation as resolved.
Show resolved Hide resolved
try {
zstd_decompressor->seek_from_begin(pos);
} catch (ZstdDecompressor::OperationFailed& e) {
if (clp::ffi::ir_stream::IRProtocolErrorCode::Supported == version_validation_result) {
zstd_decompressor->seek_from_begin(0);
return std::make_unique<StructuredIrStreamReader>(StructuredIrStreamReader::create(
std::move(zstd_decompressor),
std::move(data_buffer),
reader_options
));
}
if (clp::ffi::ir_stream::IRProtocolErrorCode::BackwardCompatible
== version_validation_result)
{
zstd_decompressor->seek_from_begin(pos);
return std::make_unique<UnstructuredIrStreamReader>(UnstructuredIrStreamReader::create(
std::move(zstd_decompressor),
std::move(data_buffer)
));
}
} catch (ZstdDecompressor::OperationFailed const& e) {
Comment on lines +174 to +191
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Ensure consistent use of ReaderOptions

While StructuredIrStreamReader::create is called with reader_options, the UnstructuredIrStreamReader::create is called without it. If ReaderOptions are applicable to both reader types, consider passing reader_options to UnstructuredIrStreamReader::create as well for consistency.

throw ClpFfiJsException{
clp::ErrorCode::ErrorCode_Failure,
__FILENAME__,
Expand All @@ -181,8 +205,11 @@ auto StreamReader::create(DataArrayTsType const& data_array) -> std::unique_ptr<
};
}

return std::make_unique<UnstructuredIrStreamReader>(
UnstructuredIrStreamReader::create(std::move(zstd_decompressor), std::move(data_buffer))
);
throw ClpFfiJsException{
clp::ErrorCode::ErrorCode_Unsupported,
__FILENAME__,
__LINE__,
std::format("Unable to create reader for IR stream with version {}.", version)
};
}
} // namespace clp_ffi_js::ir
14 changes: 8 additions & 6 deletions src/clp_ffi_js/ir/StreamReader.hpp
Original file line number Diff line number Diff line change
@@ -1,26 +1,23 @@
#ifndef CLP_FFI_JS_IR_STREAMREADER_HPP
#define CLP_FFI_JS_IR_STREAMREADER_HPP

#include <array>
#include <cstddef>
#include <memory>
#include <string_view>

#include <clp/ffi/ir_stream/decoding_methods.hpp>
#include <clp/streaming_compression/zstd/Decompressor.hpp>
#include <emscripten/val.h>

namespace clp_ffi_js::ir {
// JS types used as inputs
EMSCRIPTEN_DECLARE_VAL_TYPE(DataArrayTsType);
EMSCRIPTEN_DECLARE_VAL_TYPE(LogLevelFilterTsType);
EMSCRIPTEN_DECLARE_VAL_TYPE(ReaderOptions);

// JS types used as outputs
EMSCRIPTEN_DECLARE_VAL_TYPE(DecodedResultsTsType);
EMSCRIPTEN_DECLARE_VAL_TYPE(FilteredLogEventMapTsType);

constexpr std::array<std::string_view, 6> cUnstructuredIrVersions
= {"v0.0.2", "v0.0.1", "v0.0.0", "0.0.2", "0.0.1", "0.0.0"};

/**
* Class to deserialize and decode Zstandard-compressed CLP IR streams as well as format decoded
* log events.
Expand All @@ -36,7 +33,9 @@ class StreamReader {
* @return The created instance.
* @throw ClpFfiJsException if any error occurs.
*/
[[nodiscard]] static auto create(DataArrayTsType const& data_array
[[nodiscard]] static auto create(
DataArrayTsType const& data_array,
ReaderOptions const& reader_options
) -> std::unique_ptr<StreamReader>;

// Destructor
Expand All @@ -52,6 +51,9 @@ class StreamReader {
auto operator=(StreamReader&&) -> StreamReader& = delete;

// Methods
[[nodiscard]] virtual auto get_ir_protocol_error_code(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to expose this method? If it's just so that the caller can determine what type of IR stream we're reading, then I think it's clearer to return some kind of enum of Structured and Unstructured (and call the method something like get_ir_stream_type).

Copy link
Contributor

@davemarco davemarco Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kirkrodrigues this is a hack since we are not decoding the message in C++. The log viewer needs to know now if it is recieving json (irv2), or an irv1 message. If recieving json, it needs to format the string.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. I'd say an enum indicating that is still clearer than returning the protocol error code.

) const -> clp::ffi::ir_stream::IRProtocolErrorCode = 0;

/**
* @return The number of events buffered.
*/
Expand Down
5 changes: 5 additions & 0 deletions src/clp_ffi_js/ir/StreamReaderDataContext.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,11 @@ class StreamReaderDataContext {
*/
[[nodiscard]] auto get_deserializer() -> Deserializer& { return m_deserializer; }

/**
* @return A reference to the reader.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems unnecessary. Same with the get_deserializer docstring.

[[nodiscard]] auto get_reader() -> clp::ReaderInterface& { return *m_reader; }

private:
clp::Array<char> m_data_buffer;
std::unique_ptr<clp::ReaderInterface> m_reader;
Expand Down
198 changes: 198 additions & 0 deletions src/clp_ffi_js/ir/StructuredIrStreamReader.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,198 @@
#include "StructuredIrStreamReader.hpp"

#include <cstddef>
#include <format>
#include <memory>
#include <string>
#include <string_view>
#include <system_error>
#include <utility>
#include <vector>

#include <clp/Array.hpp>
#include <clp/ErrorCode.hpp>
#include <clp/ffi/ir_stream/Deserializer.hpp>
#include <clp/ffi/KeyValuePairLogEvent.hpp>
#include <clp/ffi/Value.hpp>
#include <clp/ir/types.hpp>
#include <clp/TraceableException.hpp>
#include <emscripten/em_asm.h>
#include <emscripten/val.h>
#include <spdlog/spdlog.h>

#include <clp_ffi_js/ClpFfiJsException.hpp>
#include <clp_ffi_js/constants.hpp>
#include <clp_ffi_js/ir/StreamReader.hpp>
#include <clp_ffi_js/ir/StreamReaderDataContext.hpp>

namespace clp_ffi_js::ir {

using namespace std::literals::string_literals;
junhaoliao marked this conversation as resolved.
Show resolved Hide resolved
using clp::ir::four_byte_encoded_variable_t;

constexpr std::string_view cLogLevelFilteringNotSupportedPrompt{
"Log level filtering is not yet supported in this reader."
};

auto StructuredIrStreamReader::create(
std::unique_ptr<ZstdDecompressor>&& zstd_decompressor,
clp::Array<char> data_array,
ReaderOptions const& reader_options
) -> StructuredIrStreamReader {
auto deserialized_log_events{std::make_shared<std::vector<clp::ffi::KeyValuePairLogEvent>>()};
auto result{StructuredIrDeserializer::create(
*zstd_decompressor,
IrUnitHandler(
deserialized_log_events,
reader_options["logLevelKey"].as<std::string>(),
reader_options["timestampKey"].as<std::string>()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use constants rather than magic strings.

)
junhaoliao marked this conversation as resolved.
Show resolved Hide resolved
)};
if (result.has_error()) {
auto const error_code{result.error()};
throw ClpFfiJsException{
clp::ErrorCode::ErrorCode_Failure,
__FILENAME__,
__LINE__,
std::format(
"Failed to create deserializer: {} {}",
error_code.category().name(),
error_code.message()
)
};
}
auto data_context = StreamReaderDataContext<StructuredIrDeserializer>(
std::move(data_array),
std::move(zstd_decompressor),
std::move(result.value())
);
junhaoliao marked this conversation as resolved.
Show resolved Hide resolved
return StructuredIrStreamReader(std::move(data_context), std::move(deserialized_log_events));
junhaoliao marked this conversation as resolved.
Show resolved Hide resolved
}

auto StructuredIrStreamReader::get_num_events_buffered() const -> size_t {
return m_deserialized_log_events->size();
}

auto StructuredIrStreamReader::get_filtered_log_event_map() const -> FilteredLogEventMapTsType {
SPDLOG_ERROR(cLogLevelFilteringNotSupportedPrompt);
return FilteredLogEventMapTsType{emscripten::val::null()};
}

void StructuredIrStreamReader::filter_log_events(LogLevelFilterTsType const& log_level_filter) {
if (log_level_filter.isNull()) {
return;
}
SPDLOG_ERROR(cLogLevelFilteringNotSupportedPrompt);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Enhance error handling in filter_log_events method

Instead of logging an error when log level filtering is unsupported, you might throw a ClpFfiJsException. This approach would more effectively notify the caller that the operation is not available, allowing for better error management in the application.


auto StructuredIrStreamReader::deserialize_stream() -> size_t {
if (nullptr == m_stream_reader_data_context) {
return m_deserialized_log_events->size();
}

constexpr size_t cDefaultNumReservedLogEvents{500'000};
m_deserialized_log_events->reserve(cDefaultNumReservedLogEvents);
auto& reader{m_stream_reader_data_context->get_reader()};
while (true) {
auto result{m_stream_reader_data_context->get_deserializer().deserialize_next_ir_unit(reader
)};
if (false == result.has_error()) {
continue;
}
auto const error{result.error()};
if (std::errc::no_message_available == error || std::errc::operation_not_permitted == error)
{
break;
}
if (std::errc::result_out_of_range == error) {
SPDLOG_ERROR("File contains an incomplete IR stream");
break;
}
Comment on lines +107 to +110
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Throw exception for incomplete IR streams

When encountering an incomplete IR stream, the code logs an error and exits the loop. Consider throwing a ClpFfiJsException to inform the caller about the incomplete stream explicitly. This change would improve error propagation and allow for more robust error handling upstream.

throw ClpFfiJsException{
clp::ErrorCode::ErrorCode_Corrupt,
__FILENAME__,
__LINE__,
std::format(
"Failed to deserialize: {}:{}",
junhaoliao marked this conversation as resolved.
Show resolved Hide resolved
error.category().name(),
error.message()
)
};
}
m_level_node_id = m_stream_reader_data_context->get_deserializer()
.get_ir_unit_handler()
.get_level_node_id();
m_timestamp_node_id = m_stream_reader_data_context->get_deserializer()
.get_ir_unit_handler()
.get_timestamp_node_id();
m_stream_reader_data_context.reset(nullptr);
return m_deserialized_log_events->size();
}

auto StructuredIrStreamReader::decode_range(size_t begin_idx, size_t end_idx, bool use_filter) const
-> DecodedResultsTsType {
if (use_filter) {
SPDLOG_ERROR(cLogLevelFilteringNotSupportedPrompt);
return DecodedResultsTsType{emscripten::val::null()};
}

if (m_deserialized_log_events->size() < end_idx || begin_idx > end_idx) {
return DecodedResultsTsType{emscripten::val::null()};
}

std::string message;
constexpr size_t cDefaultReservedMessageLength{512};
message.reserve(cDefaultReservedMessageLength);
junhaoliao marked this conversation as resolved.
Show resolved Hide resolved
auto const results{emscripten::val::array()};

for (size_t log_event_idx = begin_idx; log_event_idx < end_idx; ++log_event_idx) {
auto const& log_event{m_deserialized_log_events->at(log_event_idx)};

auto const json{log_event.serialize_to_json()};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about json_result?

if (false == json.has_value()) {
SPDLOG_ERROR("Failed to decode message.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's log the error info.

junhaoliao marked this conversation as resolved.
Show resolved Hide resolved
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we continue instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's instead place an empty object.

}

auto const& id_value_pairs{log_event.get_node_id_value_pairs()};
LogLevel log_level{LogLevel::NONE};
if (m_level_node_id.has_value()) {
auto const& log_level_pair{id_value_pairs.at(m_level_node_id.value())};
log_level = log_level_pair.has_value()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • If the log level isn't an int, get_immutable_view will throw, right? We should check that the value's type matches the expected type first (using is) and then retrieve the value.
  • If the value's type is a string, we should compare it against the log level strings, right?
  • Do we need the ternary? Personally, I would write this as: if has_value, set log_level.

Copy link
Collaborator Author

@junhaoliao junhaoliao Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. As we discussed offline, let's leave log level handling till another PR. Let me apply the changes for timestamp after I remove log level related handling.

? static_cast<LogLevel>(
log_level_pair.value()
.get_immutable_view<clp::ffi::value_int_t>()
)
: log_level;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Prevent potential exceptions when accessing id_value_pairs with missing keys

Using id_value_pairs.at() can throw an std::out_of_range exception if the key is not found. To ensure robustness, consider checking if the key exists before accessing it.

Apply this diff to safely access the map:

At lines 160-167:

-    auto const& log_level_pair{id_value_pairs.at(m_level_node_id.value())};
-    log_level = log_level_pair.has_value()
-                    ? static_cast<LogLevel>(
-                              log_level_pair.value()
-                                      .get_immutable_view<clp::ffi::value_int_t>()
-                      )
-                    : log_level;
+    auto log_level_it = id_value_pairs.find(m_level_node_id.value());
+    if (log_level_it != id_value_pairs.end()) {
+        auto const& log_level_pair = log_level_it->second;
+        if (log_level_pair.has_value()) {
+            log_level = static_cast<LogLevel>(
+                log_level_pair.value().get_immutable_view<clp::ffi::value_int_t>()
+            );
+        }
+    }

At lines 170-174:

-    auto const& timestamp_pair{id_value_pairs.at(m_timestamp_node_id.value())};
-    timestamp = timestamp_pair.has_value()
-                        ? timestamp_pair.value().get_immutable_view<clp::ffi::value_int_t>()
-                        : timestamp;
+    auto timestamp_it = id_value_pairs.find(m_timestamp_node_id.value());
+    if (timestamp_it != id_value_pairs.end()) {
+        auto const& timestamp_pair = timestamp_it->second;
+        if (timestamp_pair.has_value()) {
+            timestamp = timestamp_pair.value().get_immutable_view<clp::ffi::value_int_t>();
+        }
+    }

Also applies to: 170-174

clp::ffi::value_int_t timestamp{0};
if (m_timestamp_node_id.has_value()) {
auto const& timestamp_pair{id_value_pairs.at(m_timestamp_node_id.value())};
timestamp = timestamp_pair.has_value()
? timestamp_pair.value().get_immutable_view<clp::ffi::value_int_t>()
: timestamp;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above, we need to check the type and the ternary seems unnecessary.

}

EM_ASM(
{ Emval.toValue($0).push([UTF8ToString($1), $2, $3, $4]); },
results.as_handle(),
json.value().dump().c_str(),
timestamp,
log_level,
log_event_idx + 1
);
}

return DecodedResultsTsType(results);
}

StructuredIrStreamReader::StructuredIrStreamReader(
StreamReaderDataContext<StructuredIrDeserializer>&& stream_reader_data_context,
std::shared_ptr<std::vector<clp::ffi::KeyValuePairLogEvent>> deserialized_log_events
)
: m_stream_reader_data_context{std::make_unique<
StreamReaderDataContext<StructuredIrDeserializer>>(
std::move(stream_reader_data_context)
)},
m_deserialized_log_events{std::move(deserialized_log_events)} {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this first to match the declaration order of member variables.

} // namespace clp_ffi_js::ir
Loading
Loading