-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(core-clp): Add LZMA Compressor
implementation and LZMA dependency.
#614
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThe pull request introduces support for LZMA compression within the project. It updates the Changes
Possibly related PRs
Suggested reviewers
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
I guess two comments:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 14
🧹 Outside diff range and nitpick comments (18)
components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh (1)
23-23
: Consider pinning the package version for reproducibility.To ensure consistent builds across different environments and times, consider pinning the package version.
- liblzma-dev \ + liblzma-dev=5.2.5-2ubuntu1 \components/core/tools/scripts/lib_install/ubuntu-jammy/install-packages-from-source.sh (2)
Line range hint
1-20
: Consider adding version compatibility commentsThe script structure is good, but it would be helpful to document any version interdependencies between the libraries.
Add a comment block at the top of the file:
#!/usr/bin/env bash +# Library versions and compatibility notes: +# - liblzma 5.4.6: Compatible with libarchive 3.5.1 + # Exit on any error set -e
14-14
: Consider error handling for LZMA installationWhile the script has basic error handling with
set -e
, consider adding specific error handling for the LZMA installation.-"$lib_install_scripts_dir"/liblzma.sh 5.4.6 +if ! "$lib_install_scripts_dir"/liblzma.sh 5.4.6; then + echo "Error: LZMA installation failed. Please check system requirements." + exit 1 +ficomponents/core/tools/scripts/lib_install/ubuntu-focal/install-packages-from-source.sh (1)
17-17
: Consider installation order implicationsThe LZMA library is installed after libarchive. If libarchive is built with LZMA support, it might need to be rebuilt after LZMA installation.
Consider either:
- Moving the LZMA installation before libarchive
- Adding a rebuild step for libarchive after LZMA installation if necessary
components/core/tools/scripts/lib_install/liblzma.sh (1)
1-66
: Consider adding logging, dry-run mode, and uninstall capabilityTo improve the script's maintainability and user experience, consider these architectural improvements:
- Add logging functionality with different verbosity levels
- Implement a dry-run mode for testing
- Add uninstall capability
- Consider adding a progress indicator for long-running operations
These improvements would make the script more robust and user-friendly for both development and production use.
Would you like me to provide an implementation for any of these features?
components/core/tests/test-StreamingCompression.cpp (2)
60-64
: Consider enhancing test coverage for LZMA-specific scenarios.While the basic compression/decompression test is good, consider adding LZMA-specific test cases to validate:
- Different compression levels
- Dictionary size variations
- Error handling for corrupted LZMA streams
Would you like me to provide example test cases for these scenarios?
Line range hint
1-113
: Consider refactoring to use parameterized tests.The test structure is duplicated across compression types. Consider using Catch2's test case templates to reduce duplication and make it easier to add new compression methods.
Here's a suggested approach:
TEMPLATE_TEST_CASE("StreamingCompression", "[StreamingCompression]", std::pair<clp::streaming_compression::zstd::Compressor, clp::streaming_compression::zstd::Decompressor>, std::pair<clp::streaming_compression::passthrough::Compressor, clp::streaming_compression::passthrough::Decompressor>, std::pair<clp::streaming_compression::lzma::Compressor, clp::streaming_compression::lzma::Decompressor> ) { auto compressor = std::make_unique<typename TestType::first_type>(); auto decompressor = std::make_unique<typename TestType::second_type>(); // ... rest of the test ... }components/core/CMakeLists.txt (4)
21-23
: Standardize the status message formatThe status message format differs from other compressors. Consider using a consistent format:
- message(STATUS "Using Lempel–Ziv–Markov chain Algorithm compression") + message(STATUS "Using LZMA compression")
241-241
: Use STATUS for LZMA include directory messageThe include directory message should use the STATUS level for consistency with other similar messages:
-message("LZMA Include Dir: ${LIBLZMA_INCLUDE_DIRS}") +message(STATUS "LZMA Include Dir: ${LIBLZMA_INCLUDE_DIRS}")
231-232
: Address the static/shared library support TODOThe TODO comment indicates missing functionality for handling static vs. shared libraries.
Would you like me to help create a CMake module to handle static/shared library selection for LZMA? This would align with the project's existing pattern for other libraries.
573-573
: Consider grouping compression-related libraries togetherFor better organization, consider grouping LZMA with other compression-related libraries:
- ${LIBLZMA_LIBRARIES} - ZStd::ZStd + ZStd::ZStd + ${LIBLZMA_LIBRARIES}components/core/src/clp/streaming_compression/lzma/Compressor.hpp (1)
120-120
: Use type aliasLzmaAction
for consistency.In the
compress
method, consider using the type aliasLzmaAction
instead oflzma_action
to maintain consistency with the defined type aliases and improve readability.Apply this diff for consistency:
-auto compress(lzma_action action) -> void; +auto compress(LzmaAction action) -> void;components/core/src/clp/streaming_compression/lzma/Decompressor.hpp (3)
9-9
: Remove unnecessary inclusion of<zlib.h>
.The
<zlib.h>
header included at line 9 may be unnecessary since this decompressor uses LZMA compression. Eliminating unused headers can reduce dependencies and improve compile times.Apply this diff to remove the unnecessary include:
-#include <zlib.h>
61-62
: Complete or remove the incomplete Doxygen comment.There is an incomplete Doxygen comment starting at line 61 without any description. To maintain clear documentation, please complete the comment or remove it if it's not needed.
Apply this diff to complete the comment:
/** + * Reads an exact number of bytes from the decompressor into the buffer. + * @param buf The buffer to read data into. + * @param num_bytes_to_read The exact number of bytes to read. + * @param num_bytes_read The actual number of bytes read. */ void exact_read(char* buf, size_t num_bytes_to_read, size_t& num_bytes_read);
89-90
: Correct the reference to the appropriate method in the comment.The comment at line 89 incorrectly references
streaming_compression::zstd::Decompressor::try_seek_from_begin
, which is not relevant to the LZMA decompressor. It should refer to the appropriate LZMA method or be updated to accurately reflect the functionality.Apply this diff to correct the comment:
- * @return Same as streaming_compression::zstd::Decompressor::try_seek_from_begin + * @return Same as Decompressor::try_seek_from_begincomponents/core/src/clp/streaming_compression/lzma/Compressor.cpp (1)
86-86
: UseLZMA_STREAM_INIT
for proper initialization ofm_compression_stream
Instead of using
memset
to initializem_compression_stream
, it's recommended to use theLZMA_STREAM_INIT
macro provided by the LZMA library. This ensures all fields are correctly initialized, including any internal pointers, and is the preferred method according to the LZMA documentation.Apply this diff to initialize
m_compression_stream
correctly:- memset(m_compression_stream.get(), 0, sizeof(LzmaStream)); + *m_compression_stream = LZMA_STREAM_INIT;components/core/src/clp/streaming_compression/lzma/Decompressor.cpp (2)
113-113
: Correct error messages to reference LZMA functionsThe error messages incorrectly mention
inflate()
, which is associated with zlib. Since we're using LZMA, please update the messages to referencelzma_code()
to accurately reflect the context.Apply this diff to fix the error messages:
- SPDLOG_ERROR("streaming_compression::lzma::Decompressor inflate() ran out of memory"); + SPDLOG_ERROR("streaming_compression::lzma::Decompressor lzma_code() ran out of memory"); ... - SPDLOG_ERROR("inflate() returned an unexpected value - {}.", int(return_value)); + SPDLOG_ERROR("lzma_code() returned an unexpected value - {}.", int(return_value));Also applies to: 117-117
130-130
: Update comment to reflect LZMA instead of ZStdThe comment references 'ZStd', but this is an LZMA decompressor. Please correct the comment to avoid confusion.
Apply this diff to fix the comment:
- // ZStd has no way for us to seek back to the desired position, so just reset the stream + // LZMA has no way for us to seek back to the desired position, so just reset the stream
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (14)
components/core/.clang-format
(1 hunks)components/core/CMakeLists.txt
(4 hunks)components/core/src/clp/streaming_compression/Constants.hpp
(1 hunks)components/core/src/clp/streaming_compression/lzma/Compressor.cpp
(1 hunks)components/core/src/clp/streaming_compression/lzma/Compressor.hpp
(1 hunks)components/core/src/clp/streaming_compression/lzma/Constants.hpp
(1 hunks)components/core/src/clp/streaming_compression/lzma/Decompressor.cpp
(1 hunks)components/core/src/clp/streaming_compression/lzma/Decompressor.hpp
(1 hunks)components/core/tests/test-StreamingCompression.cpp
(2 hunks)components/core/tools/scripts/lib_install/liblzma.sh
(1 hunks)components/core/tools/scripts/lib_install/ubuntu-focal/install-packages-from-source.sh
(1 hunks)components/core/tools/scripts/lib_install/ubuntu-focal/install-prebuilt-packages.sh
(1 hunks)components/core/tools/scripts/lib_install/ubuntu-jammy/install-packages-from-source.sh
(1 hunks)components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh
(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- components/core/.clang-format
- components/core/src/clp/streaming_compression/lzma/Constants.hpp
🧰 Additional context used
🪛 cppcheck (2.10-2)
components/core/src/clp/streaming_compression/lzma/Decompressor.cpp
[error] 41-41: Rethrowing current exception with 'throw;', it seems there is no current exception to rethrow. If there is no current exception this calls std
(rethrowNoCurrentException)
[error] 45-45: Rethrowing current exception with 'throw;', it seems there is no current exception to rethrow. If there is no current exception this calls std
(rethrowNoCurrentException)
🔇 Additional comments (11)
components/core/src/clp/streaming_compression/Constants.hpp (2)
10-10
: Well-structured enum value addition!
The chosen value (0x20) follows good practices:
- Maintains consistent power-of-2 spacing with existing values
- Leaves room for future compression types between values
- Follows the established naming convention
10-10
: Verify enum value handling across the codebase
Let's ensure the new LZMA enum value is properly handled in all relevant code paths.
✅ Verification successful
Let me gather more information about the CompressorType enum usage and switch statements.
Based on the comprehensive analysis of the codebase, I can now provide a final response:
LZMA enum value is properly integrated and handled
The new LZMA enum value (0x20) is correctly integrated into the codebase with proper handling:
- The CompressorType enum is consistently defined across namespaces (clp::streaming_compression)
- LZMA implementation follows the same pattern as existing compressors (ZSTD, Passthrough)
- No switch statements directly use CompressorType enum, as the type is handled through class inheritance
- Proper initialization of LZMA compressor/decompressor classes using CompressorType::LZMA
- Test cases include LZMA compression alongside other compression types
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find potential switch statements and enum handling code that might need updates
# Search for switch statements handling CompressorType
echo "Checking for switch statements handling CompressorType..."
rg -l "switch.*CompressorType"
# Search for serialization/deserialization code
echo "Checking for serialization/deserialization code..."
rg -l "serialize.*CompressorType|deserialize.*CompressorType"
# Look for direct CompressorType comparisons
echo "Checking for CompressorType comparisons..."
rg -l "CompressorType::(ZSTD|Passthrough)"
Length of output: 989
Script:
#!/bin/bash
# Search for CompressorType enum usage with context
echo "Checking CompressorType enum usage with context..."
rg -A 5 "CompressorType"
# Search for switch statements in compression-related files
echo "Checking switch statements in compression files..."
fd -e cpp -e hpp | grep -i "compress" | xargs rg "switch.*\{" -A 5
# Look for factory or creation methods that might use CompressorType
echo "Checking factory methods..."
rg "create.*Compressor|make.*Compressor" -A 5
Length of output: 26592
components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh (1)
23-23
: LGTM! Verify package availability in Ubuntu Jammy.
The addition of liblzma-dev is correct and properly placed in alphabetical order.
✅ Verification successful
Package addition is properly integrated and verified
The liblzma-dev package addition is well-integrated into the existing LZMA compression workflow:
- Package exists in Ubuntu Jammy repositories
- Consistent with Ubuntu Focal configuration
- Properly supports the LZMA compression option in CMakeLists.txt
- Aligns with existing LZMA build infrastructure (liblzma.sh)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify liblzma-dev package availability and its dependencies in Ubuntu Jammy
# Expected: Package should be available and not have conflicting dependencies
# Check if the package exists in Ubuntu Jammy repositories
curl -s "https://packages.ubuntu.com/jammy/liblzma-dev" | grep -q "Page not found" && echo "Package not found" || echo "Package exists"
# Get package details including version and dependencies
curl -s "https://packages.ubuntu.com/jammy/liblzma-dev" | grep -A 5 "Package:"
Length of output: 300
Script:
#!/bin/bash
# Let's verify the package dependencies and version availability more thoroughly
apt-cache policy liblzma-dev 2>/dev/null || echo "Package info not available in sandbox"
# Check if the package is already used elsewhere in the codebase
rg -l "liblzma-dev" --type sh
# Look for any LZMA related build dependencies or configurations
rg -l "lzma" --type cmake --type make --type sh
Length of output: 787
Script:
#!/bin/bash
# Let's check the CMakeLists.txt to understand LZMA integration
cat components/core/CMakeLists.txt | grep -A 5 -B 5 "lzma"
# Check the content of liblzma.sh to understand existing LZMA setup
cat components/core/tools/scripts/lib_install/liblzma.sh
# Check if the package is also used in Focal
cat components/core/tools/scripts/lib_install/ubuntu-focal/install-prebuilt-packages.sh | grep "liblzma-dev"
Length of output: 3455
components/core/tools/scripts/lib_install/ubuntu-jammy/install-packages-from-source.sh (1)
14-14
: Verify LZMA installation script and dependencies
The addition of LZMA library installation looks good, but let's verify a few important aspects:
✅ Verification successful
Let me gather more information about the liblzma.sh script content and its error handling:
LZMA installation script verified and properly configured
The installation script for LZMA 5.4.6 has been thoroughly verified and shows:
- Proper error handling with
set -e
andset -u
- Secure download from the official tukaani-project repository
- Appropriate privilege management
- CMake configuration with position-independent code
- Parallel build support
- Clean installation process
The script follows the project's established patterns and integrates well with the existing build system. The CMake configuration properly requires and validates LZMA presence.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify LZMA installation script and its dependencies
# Check if liblzma.sh exists and is executable
echo "Checking liblzma.sh existence and permissions..."
fd --type f "liblzma.sh" --exec ls -l {}
# Look for version constraints or dependencies in build files
echo -e "\nChecking for version constraints..."
rg -g "CMakeLists.txt" --multiline "find_package\(.*LZMA.*\)" -A 5
Length of output: 872
Script:
#!/bin/bash
# Check the content of liblzma.sh for error handling and version validation
echo "Checking liblzma.sh content..."
cat ./components/core/tools/scripts/lib_install/liblzma.sh
# Check if there are any version-specific dependencies or constraints in the codebase
echo -e "\nChecking for LZMA version references..."
rg "LZMA.*5\.4" --type cpp --type cmake
Length of output: 1777
components/core/tools/scripts/lib_install/ubuntu-focal/install-packages-from-source.sh (1)
17-17
: Verify LZMA version compatibility and security
Please ensure that LZMA version 5.4.6 meets the project's requirements and has no known vulnerabilities.
✅ Verification successful
Let me gather more information about LZMA usage and dependencies in the project.
LZMA 5.4.6 is compatible with the project requirements
Based on the codebase analysis:
- LZMA is used as an optional general-purpose compressor in the project
- The project uses CMake's
find_package(LibLZMA REQUIRED)
without version constraints - No known vulnerabilities were found for LZMA 5.4.6
- The installation script location appears to be incorrect as
liblzma.sh
was not found at the expected path
The version 5.4.6 is suitable for use, but please ensure the installation script is properly placed at components/core/tools/scripts/lib_install/ubuntu-focal/liblzma.sh
before proceeding.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for security advisories and compatibility issues with LZMA 5.4.6
# Check for CVEs in the National Vulnerability Database
curl -s "https://services.nvd.nist.gov/rest/json/cves/2.0?keywordSearch=liblzma%205.4.6" | jq '.vulnerabilities[].cve.descriptions[].value'
# Check if any other components in the project specify LZMA version requirements
rg -i "lzma.*version|version.*lzma" --type cmake --type=txt
Length of output: 302
Script:
#!/bin/bash
# Check for LZMA dependencies and usage
rg -i "lzma" --type cmake -A 3 -B 3
# Check the liblzma installation script
cat components/core/tools/scripts/lib_install/ubuntu-focal/liblzma.sh
# Check if there are any specific LZMA version requirements in package configs
fd -e pc -e cmake -x cat {} \; | rg -i "lzma"
Length of output: 4152
components/core/tools/scripts/lib_install/ubuntu-focal/install-prebuilt-packages.sh (2)
23-23
: LGTM! The LZMA development package is correctly added.
The addition of liblzma-dev
is properly positioned in alphabetical order and aligns with the script's existing package installation pattern.
23-23
: Consider documenting Ubuntu version compatibility.
Since this script is specifically for Ubuntu Focal (20.04), we should verify and document LZMA package compatibility for other Ubuntu versions that might be supported by the project.
components/core/tests/test-StreamingCompression.cpp (1)
18-19
: LGTM! Headers are properly organized.
The LZMA compression headers follow the established pattern and are correctly placed with other compression-related includes.
components/core/src/clp/streaming_compression/lzma/Decompressor.hpp (2)
69-69
: Verify the correctness of the method reference in the comment.
The comment at line 69 mentions ReaderInterface::try_read_exact_length
, but it's unclear if this is the correct method reference. Please ensure that this comment accurately describes the return behaviour and references the appropriate method.
146-146
:
Ensure proper initialisation and cleanup of m_decompression_stream
.
The member variable m_decompression_stream
is a raw pointer initialised to nullptr
. To prevent potential memory leaks, ensure that it is properly allocated during initialisation and that all resources are released in the destructor.
Would you like assistance in reviewing the resource management for m_decompression_stream
?
components/core/src/clp/streaming_compression/lzma/Compressor.cpp (1)
20-75
: Initialization of the LZMA encoder is correctly implemented
The init_lzma_encoder
function properly initializes the LZMA encoder with the specified compression level and dictionary size. Error handling for initialization failures is thorough and follows best practices.
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Decompressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Decompressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Decompressor.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
components/core/CMakeLists.txt (3)
234-235
: Fix inconsistent variable name in status messageThe status message uses
Lzma
instead of the package nameLibLZMA
:- message(STATUS "Found Lzma ${LIBLZMA_VERSION_STRING}") + message(STATUS "Found LibLZMA ${LIBLZMA_VERSION_STRING}")
241-241
: Remove redundant debug messageThe include directory message is redundant since CMake already logs this information at a more appropriate debug level.
-message("Lzma Include Dir: ${LIBLZMA_INCLUDE_DIRS}")
231-232
: Add support for static/shared LZMA library selectionThe TODO comment indicates missing support for selecting between static and shared LZMA libraries. This should be implemented to maintain consistency with other library configurations in the project.
Would you like me to help create a FindLZMA.cmake module that supports static/shared library selection?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
components/core/CMakeLists.txt
(4 hunks)components/core/src/clp/streaming_compression/lzma/Compressor.cpp
(1 hunks)components/core/src/clp/streaming_compression/lzma/Compressor.hpp
(1 hunks)components/core/tools/scripts/lib_install/liblzma.sh
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- components/core/src/clp/streaming_compression/lzma/Compressor.hpp
- components/core/tools/scripts/lib_install/liblzma.sh
🔇 Additional comments (3)
components/core/CMakeLists.txt (1)
481-485
: Missing test coverage for LZMA compression implementation
The LZMA source files have been added to the test suite in CMakeLists.txt, but there are no corresponding test cases in the test-StreamingCompression.cpp file. This needs to be addressed to ensure the reliability of the LZMA compression functionality.
components/core/src/clp/streaming_compression/lzma/Compressor.cpp (2)
173-183
:
Prevent potential infinite loops by handling LZMA_BUF_ERROR
appropriately
In the compress
function, when lzma_code
returns LZMA_BUF_ERROR
, the current implementation continues the loop without taking corrective action. This could lead to an infinite loop, as LZMA_BUF_ERROR
indicates that no progress can be made due to insufficient input or output buffer space.
Consider handling LZMA_BUF_ERROR
explicitly to prevent potential infinite loops.
Apply this diff to handle LZMA_BUF_ERROR
:
switch (rc) {
case LZMA_OK:
break;
- case LZMA_BUF_ERROR:
- break;
+ case LZMA_BUF_ERROR:
+ // No progress can be made; handle or report the error
+ SPDLOG_ERROR("Compression stalled: LZMA_BUF_ERROR encountered.");
+ throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__);
case LZMA_STREAM_END:
hit_stream_end = true;
break;
207-213
: 🛠️ Refactor suggestion
Handle potential exceptions from m_compressed_stream_file_writer->write
Currently, the code assumes that the write
operation always succeeds. To enhance robustness, consider handling any exceptions or errors that may occur during the write operation to ensure that failures are properly reported and managed.
Apply this diff to handle possible write errors:
auto Compressor::pipe_data() -> void {
+ try {
m_compressed_stream_file_writer->write(
size_checked_pointer_cast<char>(m_compressed_stream_block_buffer.data()),
cCompressedStreamBlockBufferSize - m_compression_stream.avail_out
);
+ } catch (const std::exception& e) {
+ SPDLOG_ERROR("Failed to write compressed data: {}", e.what());
+ throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__);
+ }
m_compression_stream.next_out = m_compressed_stream_block_buffer.data();
m_compression_stream.avail_out = cCompressedStreamBlockBufferSize;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (4)
components/core/tests/test-StreamingCompression.cpp (2)
59-61
: Add documentation for intentionally omitted decompressorWhile the LZMA test section follows the established pattern, it's worth documenting why the decompressor is intentionally not initialized here, especially since it differs from other test sections.
Consider adding a comment like:
SECTION("LZMA compression") { + // Note: Decompressor intentionally not initialized as it's handled separately compressor = std::make_unique<clp::streaming_compression::lzma::Compressor>(); }
82-85
: Consider adding test reporting for skipped decompressionWhile the null check correctly handles missing decompressors, it silently skips the decompression test. Consider using Catch2's
INFO
orWARN
to report when decompression is skipped.if (nullptr == decompressor) { + WARN("Skipping decompression test - decompressor not initialized"); boost::filesystem::remove(compressed_file_path); return; }
components/core/CMakeLists.txt (2)
241-241
: Remove redundant debug messageThe include directory message is redundant since CMake already logs this information when the package is found.
-message("Lzma Include Dir: ${LIBLZMA_INCLUDE_DIRS}")
231-232
: Add support for static/shared LZMA library selectionThe TODO comment indicates missing support for selecting between static and shared LZMA libraries. This should be implemented to maintain consistency with other compression libraries in the project.
Would you like me to help create a FindLibLZMA.cmake module that supports static/shared library selection?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
components/core/CMakeLists.txt
(4 hunks)components/core/tests/test-StreamingCompression.cpp
(3 hunks)
🔇 Additional comments (4)
components/core/tests/test-StreamingCompression.cpp (2)
18-18
: LGTM! Header inclusion follows project conventions
The LZMA compressor header is properly included alongside other compression-related headers.
59-61
: Verify LZMA-specific test scenarios
While the test covers basic compression functionality, consider whether LZMA-specific scenarios need testing:
- Different compression levels
- Dictionary size variations
- Error handling for invalid LZMA streams
✅ Verification successful
Let me gather more information about the LZMA implementation and test coverage.
Test coverage appears adequate for LZMA compression
The implementation already includes essential LZMA-specific configurations and error handling:
- Compression level configuration via
init_lzma_encoder
- Dictionary size customization
- Error handling for LZMA stream operations
- CRC64 checksum verification
The basic test case is sufficient as it exercises the core compression path, which internally handles these configurations. The implementation's error handling is covered through the LZMA stream workflow.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for any existing LZMA-specific test cases
rg -l "LZMA" --type cpp --glob "test-*.cpp"
# Check for LZMA configuration options in source
ast-grep --pattern 'class $_ {
$$$
// LZMA specific configuration
$$$
}'
Length of output: 176
Script:
#!/bin/bash
# Check the LZMA compressor implementation and available configuration options
rg -A 10 "class.*[Ll]zma.*Compressor" --type cpp
# Look for LZMA-specific test cases in more detail
rg -B 2 -A 10 "SECTION.*LZMA" components/core/tests/test-StreamingCompression.cpp
# Search for any LZMA configuration or parameter settings
rg "lzma.*\(" --type cpp
Length of output: 2117
components/core/CMakeLists.txt (2)
235-235
: Fix incorrect variable name in status message
The status message uses Lzma
instead of the package name LibLZMA
.
481-483
: LGTM! Source files and library linking look good
The LZMA source files are properly organized in the streaming_compression directory, and the library is correctly linked in the target_link_libraries command.
Also applies to: 571-571
components/core/src/clp/streaming_compression/lzma/Compressor.hpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.hpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.hpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.hpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp (3)
31-35
: Consider simplifying the comment block.The comment about XZ Embedded could be more concise while maintaining its informative value.
- // Initialize the encoder using a preset. Set the integrity to check - // to CRC64, which is the default in the xz command line tool. If - // the .xz file needs to be decompressed with XZ Embedded, use - // LZMA_CHECK_CRC32 instead. + // Initialize encoder with CRC64 integrity check (use LZMA_CHECK_CRC32 for XZ Embedded compatibility)
171-174
: Document LZMA_BUF_ERROR handling strategy.The code handles LZMA_BUF_ERROR but should document why it's treated the same as LZMA_OK.
case LZMA_OK: case LZMA_BUF_ERROR: + // LZMA_BUF_ERROR indicates temporary buffer condition, safe to continue break;
1-207
: Consider adding performance optimizations for large datasets.The current implementation could benefit from:
- Buffer size tuning based on input characteristics
- Parallel compression for large datasets
- Memory-mapped I/O for large files
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
(1 hunks)components/core/src/clp/streaming_compression/lzma/Compressor.hpp
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- components/core/src/clp/streaming_compression/lzma/Compressor.hpp
🔇 Additional comments (5)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp (5)
1-17
: LGTM! Header organization and includes are well-structured.
The includes are logically organized into standard libraries, external dependencies, and project-specific headers.
70-93
: LGTM! Well-structured initialization with proper validation.
The method properly validates the initialization state and compression level before setting up the compression stream and buffers.
95-108
: LGTM! Proper resource cleanup and finalization.
The method ensures proper cleanup of resources and follows RAII principles.
110-131
:
Add error handling for the compression operation.
The write operation should handle potential compression failures.
run_lzma(LZMA_RUN);
+ if (!m_compression_stream_is_flushed) {
+ throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__);
+ }
199-206
:
Add exception handling for file write operations.
The write operation could fail and should be wrapped in a try-catch block.
auto Compressor::pipe_data() -> void {
+ try {
m_compressed_stream_file_writer->write(
clp::size_checked_pointer_cast<char>(m_compressed_stream_block_buffer.data()),
cCompressedStreamBlockBufferSize - m_compression_stream.avail_out
);
+ } catch (const std::exception& e) {
+ SPDLOG_ERROR("Failed to write compressed data: {}", e.what());
+ throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__);
+ }
m_compression_stream.next_out = m_compressed_stream_block_buffer.data();
m_compression_stream.avail_out = cCompressedStreamBlockBufferSize;
}
components/core/src/clp/streaming_compression/lzma/Compressor.hpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
Sorry for missing this question. Yes I think you need to look into CentOS and MacOS. It looks like you have already added the dependency for centos. Actually I would expect the build of MacOS to fail because LZMA library is not installed, and the updated Unittest should require the lzma dependency. Maybe it means the MacOS already comes with LZMA dependencies? |
components/core/src/clp/streaming_compression/lzma/Compressor.hpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.hpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
Added my last batch of comments. After you fix them, can you ask David to do a final check? |
Co-authored-by: haiqi96 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (4)
components/core/src/clp/streaming_compression/lzma/Compressor.hpp (4)
20-30
: Add documentation for the exception class.The
OperationFailed
exception class should include documentation explaining when this exception is thrown and what the error codes represent.Add this documentation above the class declaration:
+ /** + * Exception thrown when LZMA compression operations fail + * @throws When LZMA operations return unexpected error codes or invalid states + */ class OperationFailed : public TraceableException {
47-52
: Enhance write method documentation.The documentation should specify potential exceptions and behaviour when data is null or length is zero.
Add these details to the documentation:
/** * Writes the given data to the compressor - * @param data - * @param data_length + * @param data Pointer to the data to compress + * @param data_length Length of the data in bytes + * @throw OperationFailed if LZMA compression fails + * @throw std::invalid_argument if data is null and length is non-zero */
91-91
: Document the buffer size constant choice.Add a comment explaining why 4KiB was chosen as the buffer size and any performance implications.
- static constexpr size_t cCompressedStreamBlockBufferSize{4096}; // 4KiB + // 4KiB buffer size chosen to match typical page size for optimal I/O performance + static constexpr size_t cCompressedStreamBlockBufferSize{4096};
126-126
: Document the default dictionary size.Add a comment explaining the choice of default dictionary size and its impact on compression ratio vs memory usage.
- size_t m_dict_size{cDefaultDictionarySize}; + // Default dictionary size balancing compression ratio and memory usage + size_t m_dict_size{cDefaultDictionarySize};
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
(1 hunks)components/core/src/clp/streaming_compression/lzma/Compressor.hpp
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- components/core/src/clp/streaming_compression/lzma/Compressor.cpp
🔇 Additional comments (2)
components/core/src/clp/streaming_compression/lzma/Compressor.hpp (2)
1-15
: LGTM! Header organization follows best practices.
The includes are well-organized, following the convention of grouping system headers, external dependencies, and project headers with appropriate spacing.
125-125
:
Initialize LZMA stream properly.
The m_compression_stream
member variable should be initialized with LZMA_STREAM_INIT
to ensure proper initialization of all fields.
- lzma_stream m_compression_stream;
+ lzma_stream m_compression_stream = LZMA_STREAM_INIT;
Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
components/core/src/clp/streaming_compression/lzma/Compressor.hpp (3)
27-29
: Enhance error message specificityThe error message could be more descriptive by including the specific operation that failed.
- return "streaming_compression::lzma::Compressor operation failed"; + return "LZMA compression operation failed: " + this->get_error_code_string();
48-52
: Enhance method documentationThe write method's documentation is missing parameter descriptions and throw specifications.
/** * Writes the given data to the compressor - * @param data - * @param data_length + * @param data Pointer to the data to be compressed + * @param data_length Length of the data in bytes + * @throw OperationFailed if the compression operation fails */
94-101
: Enhance error handling documentationThe documentation should specify which LZMA error codes can trigger the OperationFailed exception.
/** * Invoke lzma_code() repeatedly with LZMA_RUN until the input is exhausted * * At the end of the workflow, the last bytes of encoded data may still be buffered in the LZMA * stream and thus not immediately available at the output block buffer. * * Assumes input stream and output block buffer are both in valid states. - * @throw `OperationFailed` if LZMA returns an unexpected error value + * @throw OperationFailed with ErrorCode_Compression if LZMA returns LZMA_MEM_ERROR + * @throw OperationFailed with ErrorCode_Compression if LZMA returns LZMA_DATA_ERROR + * @throw OperationFailed with ErrorCode_Compression if LZMA returns LZMA_OPTIONS_ERROR */
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
(1 hunks)components/core/src/clp/streaming_compression/lzma/Compressor.hpp
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- components/core/src/clp/streaming_compression/lzma/Compressor.cpp
🔇 Additional comments (3)
components/core/src/clp/streaming_compression/lzma/Compressor.hpp (3)
32-44
: LGTM! Special member functions are well-defined
The class correctly implements the rule of five with appropriate move semantics and deleted copy operations.
125-125
:
Initialize LZMA stream properly
The LZMA stream should be initialized with LZMA_STREAM_INIT to ensure proper initialization of all fields.
- lzma_stream m_compression_stream;
+ lzma_stream m_compression_stream = LZMA_STREAM_INIT;
Likely invalid or redundant comment.
7-7
: Consider platform-specific LZMA header inclusion
The direct inclusion of <lzma.h>
might need platform-specific handling or fallback mechanisms.
Consider:
- Adding platform-specific include guards
- Using a common header that handles platform differences
- Adding CMake configuration checks for LZMA availability
@haiqi96 Fixed the comments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets do a pass to try and:
- Remove excessive function body comments
- Add periods to complete sentences.
components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh
Outdated
Show resolved
Hide resolved
components/core/tools/scripts/lib_install/ubuntu-focal/install-prebuilt-packages.sh
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
Outdated
Show resolved
Hide resolved
Co-authored-by: davidlion <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (3)
components/core/CMakeLists.txt (3)
14-14
: Maintain alphabetical ordering of options.The options in the STRINGS property should be alphabetically ordered for consistency.
-set_property(CACHE GENERAL_COMPRESSOR PROPERTY STRINGS lzma passthrough zstd) +set_property(CACHE GENERAL_COMPRESSOR PROPERTY STRINGS lzma passthrough zstd)
233-233
: Add version requirement for LibLZMA.For consistency with other package requirements and to ensure compatibility, specify a minimum version requirement for LibLZMA.
-find_package(LibLZMA REQUIRED) +find_package(LibLZMA 5.2.5 REQUIRED)
231-232
: Consider implementing static linking support for LZMA.The TODO comment indicates that static linking support for LZMA is needed. This should be addressed to maintain consistency with other libraries that support static linking.
Would you like me to help implement the static linking support for LZMA or create a GitHub issue to track this task?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
components/core/CMakeLists.txt
(4 hunks)components/core/src/clp/streaming_compression/lzma/Compressor.cpp
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- components/core/src/clp/streaming_compression/lzma/Compressor.cpp
🔇 Additional comments (1)
components/core/CMakeLists.txt (1)
234-235
: Fix incorrect variable name in status message.
The status message uses LIBLZMA_FOUND
instead of the package name LibLZMA
.
-if(LIBLZMA_FOUND)
- message(STATUS "Found Lzma ${LIBLZMA_VERSION_STRING}")
+if(LIBLZMA_FOUND)
+ message(STATUS "Found LibLZMA ${LIBLZMA_VERSION_STRING}")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp (1)
25-25
: Correct grammatical error in commentIn the comment on line 25, change "resets" to "reset" to agree with the plural subject "calls".
components/core/CMakeLists.txt (1)
237-237
: Use consistent library naming in status messagesFor consistency with other libraries, consider updating the status message to 'Found LibLZMA' and ensure the correct variable is used for the version string.
Apply this diff to fix the message:
- message(STATUS "Found Lzma ${LIBLZMA_VERSION_STRING}") + message(STATUS "Found LibLZMA ${LIBLZMA_VERSION_STRING}")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
components/core/CMakeLists.txt
(4 hunks)components/core/src/clp/streaming_compression/lzma/Compressor.cpp
(1 hunks)components/core/src/clp/streaming_compression/lzma/Compressor.hpp
(1 hunks)components/core/tests/test-StreamingCompression.cpp
(4 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
components/core/tests/test-StreamingCompression.cpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
components/core/src/clp/streaming_compression/lzma/Compressor.cpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
components/core/src/clp/streaming_compression/lzma/Compressor.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
📓 Learnings (1)
components/core/tests/test-StreamingCompression.cpp (1)
Learnt from: Bill-hbrhbr
PR: y-scope/clp#614
File: components/core/tests/test-StreamingCompression.cpp:45-54
Timestamp: 2024-12-04T15:50:37.827Z
Learning: In `components/core/tests/test-StreamingCompression.cpp`, within the `compress` function, compressing the same data repeatedly by passing the same `src` pointer without advancing is intentional to test the compressor with the same data multiple times.
🔇 Additional comments (5)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp (2)
268-271
:
Handle potential exceptions from m_compressed_stream_file_writer->write
Currently, the write operation assumes success without error handling. To enhance robustness, consider wrapping the write
call in a try-catch block to handle any exceptions that may occur, ensuring failures are properly reported and managed.
Apply this diff to handle possible write errors:
+ try {
m_compressed_stream_file_writer->write(
clp::size_checked_pointer_cast<char>(m_compressed_stream_block_buffer.data()),
cCompressedStreamBlockBufferSize - m_compression_stream.avail_out
);
+ } catch (const std::exception& e) {
+ SPDLOG_ERROR("Failed to write compressed data: {}", e.what());
+ throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__);
+ }
246-251
: 🛠️ Refactor suggestion
Review the handling of LZMA_BUF_ERROR
in flush_lzma
The LZMA_BUF_ERROR
return code can occur during flushing actions and doesn't always signify a fatal error. Throwing an exception here might prematurely terminate the compression process. Consider revising the error handling to properly manage LZMA_BUF_ERROR
, possibly allowing the loop to continue or exit gracefully when appropriate.
components/core/src/clp/streaming_compression/lzma/Compressor.hpp (1)
130-130
:
Initialize m_compression_stream
with LZMA_STREAM_INIT
for proper setup
Currently, m_compression_stream
is default-initialized, which may leave its fields uninitialized and can lead to undefined behaviour. Initializing it with LZMA_STREAM_INIT
ensures that all fields are properly set.
Apply this diff to initialize m_compression_stream
correctly:
- lzma_stream m_compression_stream;
+ lzma_stream m_compression_stream = LZMA_STREAM_INIT;
Likely invalid or redundant comment.
components/core/tests/test-StreamingCompression.cpp (1)
132-135
: Clarify the LZMA test section to explain the absence of decompression
To enhance understanding, consider updating the section name and adding a comment to indicate that this is a compression-only test for LZMA.
Apply this diff to improve the test:
- SECTION("LZMA compression") {
+ SECTION("LZMA compression (compression-only test)") {
+ // Note: This is a compression-only test as decompression is not implemented yet for LZMA.
components/core/CMakeLists.txt (1)
483-485
:
Implement LZMA decompressor to complete compression support
Currently, only the LZMA compressor is implemented. To fully support LZMA compression, please implement the corresponding decompressor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp (2)
111-114
: Consider handling LZMA_PROG_ERROR separately.LZMA_PROG_ERROR indicates programming errors that should never occur in production. Consider adding additional logging or assertions for debugging purposes.
196-219
: Consider refactoring encode_lzma for better maintainability.The encode_lzma method could be split into smaller, more focused functions to improve readability and maintainability. Consider extracting the error handling logic into a separate function.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
🔇 Additional comments (3)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp (3)
97-98
: 🛠️ Refactor suggestion
Initialize the error message string to prevent potential undefined behavior.
The message pointer should be initialized with a default value.
- char const* msg{nullptr};
+ char const* msg{"Unknown error"};
Likely invalid or redundant comment.
264-273
:
Add error handling for write operations.
The write operation to the file writer should be wrapped in a try-catch block to handle potential I/O errors.
auto Compressor::flush_stream_output_block_buffer() -> void {
if (cCompressedStreamBlockBufferSize == m_compression_stream.avail_out) {
return;
}
+ try {
m_compressed_stream_file_writer->write(
clp::size_checked_pointer_cast<char>(m_compressed_stream_block_buffer.data()),
cCompressedStreamBlockBufferSize - m_compression_stream.avail_out
);
+ } catch (const std::exception& e) {
+ SPDLOG_ERROR("Failed to write compressed data: {}", e.what());
+ throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__);
+ }
attach_stream_output_buffer(&m_compression_stream, m_compressed_stream_block_buffer);
}
Likely invalid or redundant comment.
206-211
:
Enhance LZMA_BUF_ERROR handling to consider both input and output buffer states.
The current error handling for LZMA_BUF_ERROR should check both input and output buffer availability to accurately identify corruption.
case LZMA_BUF_ERROR:
+ if (m_compression_stream.avail_in > 0 && m_compression_stream.avail_out > 0) {
SPDLOG_ERROR(
"LZMA compressor input stream is corrupt. No encoding progress can be made."
);
throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__);
+ }
+ break;
- throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__);
Likely invalid or redundant comment.
Compressor
implementation and LZMA dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (2)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp (2)
95-99
:⚠️ Potential issueHandle
LZMA_BUF_ERROR
appropriately to prevent infinite loops.In
encode_lzma()
, whenlzma_code()
returnsLZMA_BUF_ERROR
, check bothavail_in
andavail_out
to decide the next action, avoiding potential infinite loops.Apply this diff to enhance error handling:
case LZMA_BUF_ERROR: - SPDLOG_ERROR( - "LZMA compressor input stream is corrupt. No encoding progress can be made." - ); - throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__); + if (m_compression_stream.avail_in == 0 && m_compression_stream.avail_out > 0) { + // No more input data; exit the loop + break; + } else if (m_compression_stream.avail_out == 0) { + // Output buffer is full; flush it + flush_stream_output_block_buffer(); + } else { + SPDLOG_ERROR("LZMA_BUF_ERROR encountered. No progress can be made."); + throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__); + }
151-160
:⚠️ Potential issueHandle exceptions from the write operation.
In
flush_stream_output_block_buffer()
, wrap thewrite
call in a try-catch block to handle any potential exceptions from the file writer.Apply this diff to add exception handling:
auto Compressor::flush_stream_output_block_buffer() -> void { if (cCompressedStreamBlockBufferSize == m_compression_stream.avail_out) { return; } + try { m_compressed_stream_file_writer->write( clp::size_checked_pointer_cast<char>(m_compressed_stream_block_buffer.data()), cCompressedStreamBlockBufferSize - m_compression_stream.avail_out ); + } catch (const std::exception& e) { + SPDLOG_ERROR("Failed to write compressed data: {}", e.what()); + throw OperationFailed(ErrorCode_WriteFailed, __FILENAME__, __LINE__); + } m_lzma_ops.attach_output_buffer(); }
🧹 Nitpick comments (8)
components/core/src/clp/streaming_compression/lzma/Compressor.hpp (6)
23-24
: Provide a more descriptive error message for unsupported compression levels.When the compression level is out of the supported range, consider including the valid range in the exception message to guide the user.
Apply this diff to enhance the error message:
- throw OperationFailed(ErrorCode_Unsupported, __FILENAME__, __LINE__); + SPDLOG_ERROR("Unsupported compression level {}. Valid range is {} to {}.", compression_level, cMinCompressionLevel, cMaxCompressionLevel); + throw OperationFailed(ErrorCode_Unsupported, __FILENAME__, __LINE__);
47-51
: Document all parameters in the method description.The
open
method's documentation lacks a description for thecompression_level
parameter. Including it enhances clarity for users of the API.Apply this diff to add the missing parameter description:
/** * Initializes the compression stream with the given compression level. * * @param file_writer The file writer to output compressed data. + * @param compression_level Compression level for the LZMA encoder (valid range is cMinCompressionLevel to cMaxCompressionLevel). */
82-89
: Group overloaded methods together for better readability.The
open
methods are overloaded but are not placed consecutively. Consider moving the defaultopen
method to be adjacent to its overloaded counterpart.
134-135
: Provide context for the buffer size constant.The
cCompressedStreamBlockBufferSize
is set to4096
. Adding a comment to explain the choice of this buffer size can help maintainability.Apply this diff to add explanatory comments:
-static constexpr size_t cCompressedStreamBlockBufferSize{4096}; // 4KiB +/** + * Size of the compressed stream block buffer. Chosen as 4KiB to balance memory usage and I/O performance. + */ +static constexpr size_t cCompressedStreamBlockBufferSize{4096}; // 4KiB
143-144
: Include exception details in method documentation.The
encode_lzma
method may throwOperationFailed
. Update the documentation to reflect this for better clarity.Apply this diff to add the exception information:
/** * Invokes lzma_code() repeatedly with LZMA_RUN until the input is exhausted. * + * @throws OperationFailed if lzma_code() returns an unexpected error. */
157-158
: Use consistent naming conventions for parameters.In the method parameters, use consistent naming conventions, such as
data_length
instead of mixinglength
andsize
.Apply this diff to rename the parameter:
- auto write(char const* data, size_t data_length) -> void override; + auto write(char const* data, size_t length) -> void override;And update all references to
data_length
in the method implementations.components/core/src/clp/streaming_compression/lzma/Compressor.cpp (2)
125-146
: Simplify the flushing logic inflush_lzma()
.The variable
flushed
may not be necessary since the loop can be controlled by the return value oflzma_code()
. This can improve readability.Apply this diff to simplify the loop:
-bool flushed{false}; -while (false == flushed) { +while (true) { if (0 == m_compression_stream.avail_out) { flush_stream_output_block_buffer(); } auto const rc = lzma_code(&m_compression_stream, flush_action); switch (rc) { case LZMA_OK: break; case LZMA_STREAM_END: // Flushing completed - flushed = true; + flush_stream_output_block_buffer(); + return; case LZMA_BUF_ERROR: SPDLOG_ERROR( "LZMA compressor input stream is corrupt. No encoding progress can be made." ); throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__); default: SPDLOG_ERROR( "lzma_code() returned an unexpected value - {}.", static_cast<int>(rc) ); throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__); } } -flush_stream_output_block_buffer();
202-221
: Improve clarity in error handling by reducing code duplication.In
init_lzma_encoder()
, the error handling code can be simplified by mapping error codes to messages using a map or function, reducing the switch-case boilerplate.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp
(1 hunks)components/core/src/clp/streaming_compression/lzma/Compressor.hpp
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
components/core/src/clp/streaming_compression/lzma/Compressor.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
* Forces all the encoded data buffered by LZMA to be available at output | ||
*/ | ||
auto flush() -> void override; | ||
|
||
/** | ||
* Tries to get the current position of the write head | ||
* @param pos Position of the write head | ||
* @return ErrorCode_NotInit if the compressor is not open | ||
* @return ErrorCode_Success on success | ||
*/ | ||
auto try_get_pos(size_t& pos) const -> ErrorCode override; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Ensure consistency between method documentation and implementation.
The try_get_pos
method is documented to return ErrorCode
, but the implementation throws exceptions. Consider aligning the implementation with the documentation or vice versa.
Apply this diff to modify the method signature and implementation:
- auto try_get_pos(size_t& pos) const -> ErrorCode override;
+ auto try_get_pos(size_t& pos) const -> void override;
Update the implementation to throw OperationFailed
when not initialized, and adjust the documentation accordingly.
Committable suggestion skipped: line range outside the PR's diff.
m_p->m_compression_stream.avail_out = m_p->m_compressed_stream_block_buffer.size(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check pointers before dereferencing to prevent null pointer exceptions.
In attach_output_buffer()
, ensure that m_p
and the buffer pointers are valid before use.
Apply this diff to add pointer checks:
auto Compressor::LzmaStreamOperations::attach_output_buffer() -> void {
+ if (nullptr == m_p || nullptr == m_p->m_compression_stream.next_out) {
+ throw OperationFailed(ErrorCode_NotInit, __FILENAME__, __LINE__);
+ }
m_p->m_compression_stream.next_out = m_p->m_compressed_stream_block_buffer.data();
m_p->m_compression_stream.avail_out = m_p->m_compressed_stream_block_buffer.size();
}
Committable suggestion skipped: line range outside the PR's diff.
if (m_compression_stream.avail_in > 0) { | ||
SPDLOG_ERROR("Tried to close LZMA compressor with unprocessed input data."); | ||
throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Process any remaining input data before closing the compressor.
In close()
, if there is unprocessed input data, consider processing it rather than throwing an exception to ensure all data is compressed.
Apply this diff to process remaining data:
if (m_compression_stream.avail_in > 0) {
- SPDLOG_ERROR("Tried to close LZMA compressor with unprocessed input data.");
- throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__);
+ SPDLOG_WARN("Closing compressor with unprocessed input data. Processing remaining data.");
+ encode_lzma();
}
Committable suggestion skipped: line range outside the PR's diff.
Description
test-StreamingCompression
for conciseness. #599Validation performed
Updated
test-StreamingCompression.cpp
to include LZMA compressors. Tested that compressing workflow finishes without errors.Summary by CodeRabbit
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
liblzma
andliblzma-dev
packages.Tests
StreamingCompression
test suite to include LZMA compression tests.Chores
liblzma
library.xz-devel
package in CentOS prebuilt packages script.xz
package to the Homebrew installation script for macOS.