Support WAL Compression #12476

Merged
merged 32 commits into from
Jun 18, 2024

@THUMarkLau THUMarkLau commented May 7, 2024

Write-ahead logging consumes a significant amount of I/O during data writing, sometimes 2-7 times the I/O of TsFile writes. The WAL is currently written uncompressed, which is a main reason for its large volume. Compressing the WAL can therefore save I/O resources and improve system performance. This PR introduces support for WAL compression, with the following characteristics:

  1. The basic unit for compression and decompression is a WAL Buffer (16MB by default), which may contain several WAL Entries. This design achieves a good compression ratio while minimizing modifications to upper-layer applications.

  2. To maintain compatibility with previous versions, the original Magic String in the WAL file is replaced with a Version String that distinguishes between WAL versions. Future WAL versions can be supported by adding new Version Strings.

  3. Multiple compression methods are supported. The choice of compression method is not currently exposed to users, but the design supports various compression methods, and this option may be exposed in the future.
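
The three points above can be illustrated with a minimal sketch. It is not the PR's implementation: the class name, the version string value, the header layout (type, original size, stored size) and the use of `java.util.zip.Deflater` are all assumptions chosen so the example runs with the JDK alone. It shows a whole buffer being compressed as one unit, a per-segment compression type so that several methods can coexist, and a version string check at the head of a file.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Illustrative only: names and layout are assumptions, not IoTDB's actual WAL format. */
final class WalSegmentSketch {
  // Hypothetical version string written once at the head of a WAL file.
  static final byte[] VERSION_STRING = "WAL-SKETCH-V2".getBytes(StandardCharsets.UTF_8);

  // Hypothetical compression type codes stored in each segment header.
  static final byte UNCOMPRESSED = 0;
  static final byte DEFLATE = 1;

  // Header: type (1 byte) | original size (4 bytes) | stored size (4 bytes).
  static final int HEADER_BYTES = 1 + Integer.BYTES + Integer.BYTES;

  /** Returns true if the file head starts with the version string this sketch knows. */
  static boolean hasKnownVersion(byte[] fileHead) {
    return fileHead.length >= VERSION_STRING.length
        && Arrays.equals(Arrays.copyOf(fileHead, VERSION_STRING.length), VERSION_STRING);
  }

  /** Encodes one whole buffer as a segment; falls back to raw bytes if deflate does not help. */
  static byte[] encodeSegment(byte[] buffer, boolean compress) {
    byte type = UNCOMPRESSED;
    byte[] payload = buffer;
    if (compress) {
      byte[] deflated = deflate(buffer);
      if (deflated.length < buffer.length) {
        type = DEFLATE;
        payload = deflated;
      }
    }
    return ByteBuffer.allocate(HEADER_BYTES + payload.length)
        .put(type)
        .putInt(buffer.length)
        .putInt(payload.length)
        .put(payload)
        .array();
  }

  /** Decodes a segment produced by encodeSegment back into the original buffer bytes. */
  static byte[] decodeSegment(byte[] segment) {
    ByteBuffer in = ByteBuffer.wrap(segment);
    byte type = in.get();
    int originalSize = in.getInt();
    byte[] payload = new byte[in.getInt()];
    in.get(payload);
    if (type == UNCOMPRESSED) {
      return payload;
    }
    Inflater inflater = new Inflater();
    inflater.setInput(payload);
    byte[] result = new byte[originalSize];
    try {
      int offset = 0;
      while (offset < originalSize && !inflater.finished()) {
        int n = inflater.inflate(result, offset, originalSize - offset);
        if (n == 0) {
          break; // no progress: input exhausted or malformed
        }
        offset += n;
      }
      if (offset != originalSize) {
        throw new IllegalStateException("corrupt segment");
      }
    } catch (DataFormatException e) {
      throw new IllegalStateException("corrupt segment", e);
    } finally {
      inflater.end();
    }
    return result;
  }

  private static byte[] deflate(byte[] input) {
    Deflater deflater = new Deflater();
    deflater.setInput(input);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] chunk = new byte[4096];
    while (!deflater.finished()) {
      out.write(chunk, 0, deflater.deflate(chunk));
    }
    deflater.end();
    return out.toByteArray();
  }
}
```

Because the header records both the compression type and the original size, a reader in this sketch can decode each segment without knowing how the writer was configured when the segment was produced.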

Measurements of how this feature affects system resource consumption and performance will be added in follow-up comments.

@THUMarkLau THUMarkLau force-pushed the wal-compress-formal-branch branch from 834e61d to 70d8400 Compare May 11, 2024 08:24
@THUMarkLau THUMarkLau force-pushed the wal-compress-formal-branch branch 2 times, most recently from 89b0f35 to 94152b2 Compare May 24, 2024 06:04

codecov bot commented May 25, 2024

Codecov Report

Attention: Patch coverage is 82.53968% with 55 lines in your changes missing coverage. Please review.

Project coverage is 41.80%. Comparing base (89af73d) to head (d0b76f3).

| Files | Patch % | Lines |
|---|---|---|
| ...torageengine/dataregion/wal/io/WALInputStream.java | 79.59% | 40 Missing ⚠️ |
| ...eengine/dataregion/wal/utils/WALEntryPosition.java | 50.00% | 7 Missing ⚠️ |
| ...java/org/apache/iotdb/db/conf/IoTDBDescriptor.java | 50.00% | 4 Missing ⚠️ |
| ...wal/allocation/AbstractNodeAllocationStrategy.java | 0.00% | 3 Missing ⚠️ |
| ...b/storageengine/dataregion/wal/io/WALMetaData.java | 88.88% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #12476      +/-   ##
============================================
+ Coverage     41.75%   41.80%   +0.04%     
  Complexity       71       71              
============================================
  Files          3293     3294       +1     
  Lines        207571   207852     +281     
  Branches      24664    24708      +44     
============================================
+ Hits          86673    86888     +215     
- Misses       120898   120964      +66     


@THUMarkLau THUMarkLau force-pushed the wal-compress-formal-branch branch from e7250f8 to 069795d Compare May 26, 2024 06:25
@@ -79,6 +132,10 @@ public long size() {
return size;
}

public long oiginalSize() {
Collaborator

This could be renamed to `originalSize`.

Contributor Author

Fixed.

@@ -221,7 +222,7 @@ public class IoTDBConfig {
private int walBufferQueueCapacity = 500;

/** Size threshold of each wal file. Unit: byte */
- private volatile long walFileSizeThresholdInByte = 30 * 1024 * 1024L;
+ private volatile long walFileSizeThresholdInByte = 30 * 1024L * 1024L;
Collaborator

Why change this here?

Contributor Author

Reverted.
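
As background to the question in this thread: the two spellings of the threshold in the hunk above evaluate to the same `long`, because `30 * 1024` fits in an `int` before it is widened. The sketch below uses hypothetical helper names and a parameter instead of the literal `30`. It checks that equivalence and shows the only case where the placement of the `L` suffix matters, which is when the `int` intermediate overflows.

```java
/** Illustrative only: helper names are hypothetical and not part of the PR. */
final class WalThresholdArithmetic {
  // Mirrors "n * 1024 * 1024L": n * 1024 is computed in int, then widened to long.
  static long intThenWiden(int n) {
    return n * 1024 * 1024L;
  }

  // Mirrors "n * 1024L * 1024L": every multiplication is performed in long.
  static long widenEarly(int n) {
    return n * 1024L * 1024L;
  }
}
```

For `n = 30` both methods return 31457280. They diverge only for inputs such as `n = 4194304`, where `n * 1024` no longer fits in an `int`.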

@@ -763,6 +763,10 @@ data_replication_factor=1
# Datatype: long
# iot_consensus_cache_window_time_in_ms=-1

# WAL compression algorithm
# options: UNCOMPRESSED, SNAPPY, LZ4, GZIP, ZSTD
# wal_compress_algorithm=UNCOMPRESSED
Collaborator

@Caideyipi Caideyipi May 28, 2024

We may need some tuning to decide the default algorithm.

Contributor Author

Changed the configuration item to a boolean (true or false) that enables or disables WAL compression.
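
A minimal sketch of how a boolean switch like this could be mapped onto an internal compression choice. The key name `enable_wal_compression` is the one that appears later in this thread. The `Algorithm` enum and the choice of LZ4 for `true` are assumptions for illustration only, since the thread does not state which algorithm the flag selects.

```java
import java.util.Properties;

/** Illustrative only: the enum and the true -> LZ4 mapping are assumptions. */
final class WalCompressionConfigSketch {
  enum Algorithm { UNCOMPRESSED, LZ4 }

  /** Reads the boolean switch and maps it to a single internal algorithm choice. */
  static Algorithm fromProperties(Properties props) {
    boolean enabled =
        Boolean.parseBoolean(props.getProperty("enable_wal_compression", "false"));
    // Hypothetical mapping: the user only toggles compression; the algorithm stays internal.
    return enabled ? Algorithm.LZ4 : Algorithm.UNCOMPRESSED;
  }
}
```

Keeping the algorithm behind a boolean leaves room to change the internal default later without altering the user-facing option.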

@THUMarkLau THUMarkLau force-pushed the wal-compress-formal-branch branch from eb167a8 to 93a6482 Compare May 31, 2024 00:35
Contributor

@jt2594838 jt2594838 left a comment

There are issues that should be solved. Also, please add new tests.

@@ -1215,6 +1215,10 @@ data_replication_factor=1
# Datatype: long
# iot_consensus_cache_window_time_in_ms=-1

# Enable Write Ahead Log compression.
# Option: true, false
# enable_wal_compression=false
Contributor

Should this option support hot reload?

Contributor Author

Done.
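
One common way to make a flag like this hot-reloadable is sketched below. It is an assumption about the general pattern, not the code merged in this PR, and the class and method names are hypothetical. The flag is held in a `volatile` field that a reload call overwrites, and the writer reads it once per buffer.

```java
import java.util.Properties;

/** Illustrative only: class and method names are hypothetical. */
final class WalCompressionSwitch {
  // volatile so a reload on one thread becomes visible to writer threads.
  private volatile boolean enabled = false;

  /** Re-reads the flag from freshly loaded properties; keeps the old value if the key is absent. */
  void reload(Properties props) {
    String value = props.getProperty("enable_wal_compression");
    if (value != null) {
      enabled = Boolean.parseBoolean(value);
    }
  }

  /** Called by the writer before each buffer, so a change applies from the next buffer on. */
  boolean isEnabled() {
    return enabled;
  }
}
```

Toggling the flag in the middle of a file is safe in such a design only if each written unit records whether it was compressed, which this sketch assumes.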

Contributor

@jt2594838 jt2594838 left a comment

Good, please see to the remaining minor issues.

@THUMarkLau THUMarkLau force-pushed the wal-compress-formal-branch branch from ee76763 to d0b76f3 Compare June 17, 2024 07:22

sonarcloud bot commented Jun 17, 2024

Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarCloud

@jt2594838 jt2594838 merged commit 0bb4619 into master Jun 18, 2024
38 of 40 checks passed
@HTHou HTHou deleted the wal-compress-formal-branch branch June 18, 2024 04:51
JackieTien97 added a commit that referenced this pull request Jun 21, 2024
* [IOTDB-6336] Add max retry time duration and whether to retry for unknown errors configurations

* Optimize error message when creating unsupported data type (#12650)

Optimize error message when creating unsupported data type

* Pipe: fix uninitialized config when validate PipeHistoricalDataRegionTsFileExtractor parameters & Subscription: avoid poll messages from unsubscribed topics (#12648)

* Pipe: fix uninitialized config when validate PipeHistoricalDataRegionTsFileExtractor parameters

* Subscription: avoid poll messages from unsubscribed topics

* Fix region migration's timeout judgement #12639

* fix compaction recover skip negative time partition (#12637)

* Fix Concurrency Instability When IoTConsensus LogDispatcher Exits #12655

* JDBC: Fix IndexOutOfBoundsException when calling getBytes (#12662)

* JDBC: Fix ArrayOutOfBoundException when calling getBytes

* Update JDBCCharsetExample.java

* [IOTDB-6335] Redundant rows when using GROUP BY TIME with LIMIT

* Pipe: Enhance error handling logic in pipe async connector's handlers (#12669)

* Print log when read resource error (#12666)

* Load: Make load SQL option VERIFY can only be set to true (#12670)

* Pipe: Improved hybrid algorithm and avoid the caculations of tsfiles in connector pending queue tsfiles to block the forwarding of realtime requests (#12668)

* Migrate to Develocity Maven Extension (#12658)

* Migrate to Develocity Maven Extension
* Update to Common Custom User Data Maven Extension 2.0

* Update import schema logic implementation and optimize -help display information (#12643)

* add import-schema.sh/bat and export-schema.sh/bat in cli/tools

* fix import schema

* fix import schema

---------

Co-authored-by: 2b3c511 <[email protected]>

* Subscription: fix topic now timestamp precision (#12663)

* [IOTDB-6337] Refine the count calculation in RegionScan framework

* Bump version of master branch to 1.3.3-SNAPSHOT (#12672)

* Pipe: Reduce warn log when PipeConfigRegionExtractorMetrics does not exist (#12673)

* [IOTDB-6061] Fix the instability failure caused by initServer in IoTConsensus UT not binding to the corresponding port (#12674)

* fix concurrent

* fix concurrent

* Load: Make load SQL option VERIFY can only be set to true (A follow-up fix for #12670) (#12676)

* Pipe: add compression level config for connector ZSTD compressor (#12630)

Co-authored-by: Steve Yurong Su <[email protected]>

* [PY-client] Unify setup.py and requirement.txt (#12601)

* Pipe IT: Ignore failed cases caused by cluster restart failure (#12678)

* update import and export data -help description (#12677)

Co-authored-by: 2b3c511 <[email protected]>

* Pipe: Fix potential NPE when lastEvent == null in PipeConnectorSubtask (#12680)

* Fix wrong calculation for maxBytesCanReserve for IdentitySinkOperator

* Change floating point encoding method in IT

* Throw `IndexOutOfBound` exception in row implementation.

* Thread safely SeriesPartitionTable (#12679)

* finish

* use concurrentskiplist

* bug fix

---------

Co-authored-by: OneSizeFitQuorum <[email protected]>

* Pipe: Refactor some features for user access (#12686)

* Set timestamp precision same as time column (#12681)

* done

* done

* rollback change

* fix IT

* fix IT

* Set configuration on node (#12626)

Co-authored-by: Haonan <[email protected]>

* Fix cli report error when ignoreTimestamp is true (#12691)

* done

* done

* rollback change

* fix IT

* fix IT

* done

* invoke ci

* done

* invoke ci

* Load: Add check for BufferUnderflowException (#12690)

Co-authored-by: xz m <[email protected]>

* remove excess stack

* Fix new UDTF execution framework dead loop error.

* Perfect methods of IAnalysis to adapt the write process of table model

* Load: Support storing piecenode in multiple folders during 2nd phase (#12675)

Co-authored-by: Steve Yurong Su <[email protected]>

* Pipe: add decompressed length in RPC compression payload to avoid potential OOM on receiver (#12701)

* Load: Update loadTsFileDirs after all newLoadTsFileDirs are generated to avoid undefined behavior (#12712)

* Pipe: Fixed NPE in schema pattern parsing & Improved connector subtask retry logic and logs & Fixed the executor used by consensus pipe (#12704)

* Procedure: Fix ProcedureInfo on CN leader may decrease its procId (#12711)

* Fix missing of linux memory metrics (#12713)

* fix missing memory metric

* remove the print of type

* update import and export tsfile description (#12684)

Co-authored-by: 2b3c511 <[email protected]>

* Subscription: skip on setup and cluster failure when running subscription restart IT & fix some bugs in SubscriptionExecutorServiceManager (#12710)

* Update pom.xml to fix security issue (#12705)

Co-authored-by: Christofer Dutz <[email protected]>

* Rename IoTV2Consensus to IoTConsensusV2 (#12715)

Signed-off-by: OneSizeFitQuorum <[email protected]>

* Fix NumberFormatException when upgrade from old version #12719

* Replenish effective mode in iotdb-system.properties (#12706)

* load-tsfile script: disable redirection & load: handle exceptions using the Analysis objects instead of throwing exceptions & pipe: handle SYSTEM_READ_ONLY correctly on receiver side (#12716)

* Subscription: improve parsing logic when using JAVA SDK client & refactor subscription IT & intro `getSubscribedTopicNames` API (#12721)

* Pipe: Fix schema events can not report & Fix delete data events in data regions may fail to mark at schema metrics (#12722)

* Pipe: Support `"source.history.loose-range" = "path"` in iotdb-source  (#12651)

Co-authored-by: Steve Yurong Su <[email protected]>

* [IOTDB-6338] Fix wrong query result while using some value filter with LIMIT/OFFSET

* Pipe: fix threads of IoTDB-Pipe-Processor-Executor-Pool stucked by PipeTsFileInsertionEvent#waitForTsFileClose (#12727)

* [ISSUE 12499] Reject query with massive time span in limited memroy

* Fix/include thrift lib and headers in cpp client (#12734)

* fix: Hopefully fixed the build of the cpp-example on windows systems

* fix: Increase the timeout even more to finally make the build succeed.

* fix: Fix the build.

* chore: Fixed the assembly descriptor of the cpp-client to include the thrift libs and header files.

* Pipe / Load / Subscription: Support new TsDataTypes (STRING / BLOB / TIMESTAMP / DATE) (#12665)

* Optimize the log printing when multiple error occur (#12732)

* Update packege hierachy in Client-cpp readme (#12736)

* Support agg(*) query in templated align by device situation

* fix config node use IoTDBDescriptor (#12730)

* PipeConsensus: Delete WAL after dropping database & fix receiver directory recovery (#12738)

* fix wal deletion & create receiver dir

* update consensus protocol class

* merge judge

* modify directory path

* modify directory path

* Add instructions of compiling client-cpp with VS2019. (#12739)

* Subscription: adapt pipe completion signal for automatically drop subscription (#12724)

* Verify connection && Refactor async connection framework (#12667)

* PipeConsensus: add metrics and fix some bugs for pipeConsensus (#12723)

* Decrease TTL Deletion in compaction modification cache (#12687)

* decrease TTL Deletion in compaction modification cache

* Update iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/utils/MultiTsFileDeviceIterator.java

Co-authored-by: Jiang Tian <[email protected]>

* modify MultiTsFileDeviceIterator

* fix spotless

---------

Co-authored-by: Jiang Tian <[email protected]>

* Stabilize verify connection IT  (#12745)

* done

* more !

* Fix compaction npe when set modified status (#12729)

* fix compaction npe when set modified status

* check empty page

* Subscription IT: execute jstack for potential stuck test cases (#12728)

* Pipe: Smoothed the rate in pipe's remaining time calculations (#12699)

* Pipe: Fix PipeDataRegionEventCounter not work in PipeRealtimeDataRegionExtractor (#12747)

* Fix python CI (#12754)

* Support show device ttl on path pattern (#12709)

* support show ttl on pathPattern

* fix UT

* modify details

* Support WAL Compression (#12476)

* enable wal compression

remove metrics in mem table flush task, cache hash code in partial path, use gzip to compress wal

batch update metrics

* fix bug

* fix compilation problem

* remove useless code

* recover some code

* support compression type in WAL Compress Header

* support multi version WAL

* edit configuration item

* add log for WAL size

* temp for debug

* fix bug

* remove useless log

* remove one configuration

* use compression rate to update wal disk usage

* fix ut

* fix test

* set default to uncompress

* fix wal ut

* optimize calculating of wal size

* close wal file when the origin size of wal buffer is larger than threshold

* add the size of magic string

* may be fix the bug

* fix with comment

* edit with review

* fix test

* add test for wal compression

* add hot reload

* clean the code to make it more readable

* reuse the byte buffer if possible

* Indicate the encoding of String

* Edit according to comment

* spotless

* Optimize the devicePath generation efficiency of PlanNode when deserializing (#12749)

* Pipe: Support "source.realtime.loose-range" = "path" in iotdb-source (#12751)

* Support active schema query (#12718)

* Check duplicate in start-node and stop-node scripts (#12756)

* check duplicate key in start node script

* modify stop node scripts

* use \n as line separator

* fix config node send showConfiguration to config node classCastException

* [IOTDB-6339] Optimize the time slice control of SeriesScanOperator and AlignedSeriesScanOperator

* Optimize node startup process & Normalize system.properties management (#12752)

* [IOTDB-6340] Clear Coordinator.queryExecutionMap while encountering MemoryNotEnough Exception

* [IOTDB-6342] Clear FragmentInstanceManager.dataNodeQueryContextMap while encountering MemoryNotEnough Exception (#12762)

* Refactor node registration check (#12766)

* done

* done

* Fix overlap between tsfile is not correctly marked (#12748)

* file fileOverlap is not correctly marked

* fix repair compaction may generate too large chunk

* Subscription: support topic loose range for path and time  (#12760)

* Add clean logic for FragmentInstance in case that callback is not added. (#12768)

* SystemPropertiesHandler use Files::move instead of File::renameTo (#12770)

* Files.move

* ?

* ?

* ?

* Pipe: Filter devices by pattern before reading device metadata from TsFile (#12765)

Currently we read the metadata of all devices and measurements when constructing a TsFileInsertionDataContainer. This is both time-consuming and memory-wasting if only a few devices match the pattern. This PR filters the devices by pattern before reading the metadata of devices and measurements, saving memory and I/O cost when there are many unmatched devices.

Note: this only works when TsFile metadata are not cached (due to high memory usage of TsFile metadata cache), because cached metadata can not be filtered so that pipes with arbitrary patterns can use it.

---------

Co-authored-by: Steve Yurong Su <[email protected]>

* Pipe: Adjusted some loggers of metrics to avoid unnecessary warns & Include "lastEvent" into pipe's event count metrics & Removed the "userConflict" judgment to data sync failure caused by METADATA_ERROR (#12758)

* Synchronize takeSnapshotAsync on RaftGroupID (#12767)

* SystemPropertiesHandler close reader and writer #12772

* Pipe Consensus: Fix events in connector are not in order because of parameter `'realtime-first' = 'true'` (#12773)

* Pipe: Avoid event being continuously rate-limited even its pipe is dropped (#12753)

Co-authored-by: Steve Yurong Su <[email protected]>

* Load: Add some load metrics of time cost, write point and disk throughput (#12735)

Co-authored-by: Steve Yurong Su <[email protected]>

---------

Signed-off-by: OneSizeFitQuorum <[email protected]>
Co-authored-by: Chen YZ <[email protected]>
Co-authored-by: V_Galaxy <[email protected]>
Co-authored-by: Li Yu Heng <[email protected]>
Co-authored-by: shuwenwei <[email protected]>
Co-authored-by: Xiangpeng Hu <[email protected]>
Co-authored-by: Steve Yurong Su <[email protected]>
Co-authored-by: YangCaiyin <[email protected]>
Co-authored-by: Haonan <[email protected]>
Co-authored-by: Caideyipi <[email protected]>
Co-authored-by: Clay Johnson <[email protected]>
Co-authored-by: Summer <[email protected]>
Co-authored-by: 2b3c511 <[email protected]>
Co-authored-by: Zikun Ma <[email protected]>
Co-authored-by: Liao Lanyu <[email protected]>
Co-authored-by: Zhihao Shen <[email protected]>
Co-authored-by: Yongzao <[email protected]>
Co-authored-by: OneSizeFitQuorum <[email protected]>
Co-authored-by: ppppoooo <[email protected]>
Co-authored-by: xz m <[email protected]>
Co-authored-by: CritasWang <[email protected]>
Co-authored-by: Beyyes <[email protected]>
Co-authored-by: Itami Sho <[email protected]>
Co-authored-by: ZhangHongYin <[email protected]>
Co-authored-by: Christofer Dutz <[email protected]>
Co-authored-by: liuminghui233 <[email protected]>
Co-authored-by: Jiang Tian <[email protected]>
Co-authored-by: yschengzi <[email protected]>
Co-authored-by: Peng Junzhi <[email protected]>
Co-authored-by: 周沛辰 <[email protected]>
Co-authored-by: Liu Xuxin <[email protected]>
Co-authored-by: William Song <[email protected]>
SzyWilliam pushed a commit to SzyWilliam/iotdb that referenced this pull request Nov 26, 2024