Add support for buffering records on disk allowing control over upload size. #5

Merged: 40 commits into y-scope:main on Jul 30, 2024

Conversation

davemarco (Contributor) commented Jul 16, 2024

Description

The plugin can now accumulate logs on disk and upload them to S3 once a size threshold is reached. Logs are stored using a "trash compactor" design, described below. A recovery mechanism was added for abrupt crashes: on startup, the plugin can find buffered logs stored on disk and send them to S3. Lastly, I added an index to the object key which increments after each upload.

I wanted to include a timeout threshold in this PR (i.e., send buffered logs to S3 after a timeout even if the size threshold is not reached); however, the PR is already too large. The timeout is non-trivial since we have no easy way to retake execution from Fluent Bit after a timeout if no new logs are sent to the output plugin. I believe the timeout requires the use of goroutines. I will add another PR to incorporate the timeout threshold.
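
For context, a goroutine-based timer is one way such a timeout could regain control when no new chunks arrive. The sketch below is purely illustrative and not part of this PR; startFlushTimer, the flush callback, and the reset-on-chunk pattern are hypothetical.

package sketch

import "time"

// startFlushTimer arms a timer that fires flush() after the timeout elapses
// with no activity. Each time a new chunk arrives, the caller would call
// Reset(timeout) on the returned timer to push the deadline back.
func startFlushTimer(timeout time.Duration, flush func()) *time.Timer {
	return time.AfterFunc(timeout, flush)
}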

Trash Compactor Disk Store

When designing the buffer, I had to decide whether to buffer logs on disk uncompressed or compressed. I chose compressed for a few reasons:

  1. Less disk usage.
  2. Better performance, since we avoid serializing log events to disk and then deserializing them prior to IR/Zstd encoding.
  3. More precise control over the upload size; with an uncompressed buffer, the upload size depends more on the compression ratio.

Using a compressed buffer introduced challenges related to data recovery and the compression ratio.

A simple approach for the buffer would be to send all the events destined for one S3 upload to a streaming compressor and only close the stream when the target upload size is reached. However, the streaming compressor keeps frames/blocks open in between receipt of Fluent Bit chunks, and open frames/blocks may not be recoverable after an abrupt crash. Therefore, I decided to "compact" each chunk into its own Zstd frame. When the upload size is reached, the stack of frames is sent to S3. As a result, for the majority of runtime, logs are stored as valid Zstd and can be sent to S3 on startup. An end-of-stream byte is appended to the Zstd data on upload to terminate the IR stream. This approach fixes the data recovery issue; however, if the chunks are small, the compression ratio will be poor.
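
As a rough illustration of the per-chunk "compaction" step, here is a minimal sketch assuming the klauspost/compress Zstd bindings; compactChunk and its signature are hypothetical and not the plugin's actual API.

package sketch

import (
	"os"

	"github.com/klauspost/compress/zstd"
)

// compactChunk compresses one chunk's IR bytes into its own, fully closed
// Zstd frame appended to the on-disk Zstd buffer. Because each frame is
// closed immediately, the buffer is always valid Zstd and remains
// recoverable after an abrupt crash.
func compactChunk(zstdFile *os.File, irBytes []byte) error {
	enc, err := zstd.NewWriter(zstdFile)
	if err != nil {
		return err
	}
	if _, err := enc.Write(irBytes); err != nil {
		enc.Close()
		return err
	}
	// Close flushes and terminates the frame; concatenated closed frames
	// still decode as a single valid Zstd stream.
	return enc.Close()
}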

To fix the compression ratio, I added a second, uncompressed IR buffer. First, log events are converted to uncompressed IR and buffered into "bins"; the uncompressed IR represents the uncompressed trash in the "trash compactor". Once a bin is full, it is "compacted" into its own Zstd frame. Each bin may contain multiple Fluent Bit chunks. Buffering uncompressed IR fixes the poor compression ratio associated with "compacting" each Fluent Bit chunk individually.

Below is a summary of the control flow for the disk buffer (a code sketch follows the list):

  • The Fluent Bit engine groups logs by tag and flushes them to the output plugin every second.
  • The output plugin recognizes the tag, parses the logs into IR, and stores the IR in a tag-specific disk IR buffer.
  • If the IR size < the IR size threshold, the IR is left in the buffer and control is returned to the Fluent Bit engine.
  • If the IR size > the IR size threshold, the buffered IR is compressed into Zstd and stored in the disk Zstd buffer. The disk IR buffer is truncated and the Zstd frame is explicitly closed.
  • If the Zstd size < the upload size threshold, nothing is uploaded and control is returned to the Fluent Bit engine.
  • If the Zstd size > the upload size threshold (i.e., multiple IR bins have been compressed), the Zstd frames are sent to S3 and the Zstd buffer is truncated. After the upload, control is returned to the Fluent Bit engine.
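
A minimal sketch of the decision points above; all names (onFlush, compactBin, uploadToS3, the threshold parameters) are hypothetical stand-ins for the plugin's actual buffer bookkeeping.

package sketch

import "os"

// onFlush sketches the per-tag flow when Fluent Bit hands a chunk of encoded
// IR to the output plugin. Sizes are read from the underlying buffer files;
// thresholds come from the plugin configuration.
func onFlush(irFile, zstdFile *os.File, ir []byte, irThreshold, uploadThreshold int64) error {
	// Append the new IR to the tag-specific IR buffer ("bin").
	if _, err := irFile.Write(ir); err != nil {
		return err
	}
	irSize, err := fileSize(irFile)
	if err != nil {
		return err
	}
	if irSize < irThreshold {
		// Bin not full yet; return control to the Fluent Bit engine.
		return nil
	}
	// "Compact" the full bin into one closed Zstd frame and truncate the bin.
	if err := compactBin(irFile, zstdFile); err != nil {
		return err
	}
	zstdSize, err := fileSize(zstdFile)
	if err != nil {
		return err
	}
	if zstdSize < uploadThreshold {
		// Not enough compressed data yet; keep stacking frames on disk.
		return nil
	}
	// Append the end-of-stream byte, upload the stacked frames to S3, then
	// truncate the Zstd buffer for the next object.
	if err := uploadToS3(zstdFile); err != nil {
		return err
	}
	return zstdFile.Truncate(0)
}

// fileSize returns the current size of an open file.
func fileSize(f *os.File) (int64, error) {
	info, err := f.Stat()
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

// compactBin and uploadToS3 are placeholders for the real compression and
// upload steps (see the compaction sketch earlier in this description).
func compactBin(irFile, zstdFile *os.File) error { return nil }
func uploadToS3(zstdFile *os.File) error         { return nil }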

There is currently no timeout functionality, as mentioned at the top of the PR description. As a result, the plugin will not upload logs if the log quantity < the upload size threshold. A timeout is being added in the next PR.

Recovery

On startup, the plugin will look for IR and Zstd files in the store directory and group them by tag. It will then compress the leftover IR into the Zstd file and send it to S3.
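
A rough sketch of what that startup pass could look like; the file-naming scheme ("<tag>.ir" / "<tag>.zst") and helper names are assumptions for illustration, not the plugin's actual layout.

package sketch

import (
	"path/filepath"
	"strings"
)

// recoverStores scans the store directory for leftover buffer files, groups
// them by tag, compacts any remaining IR into the tag's Zstd file, and
// uploads the result to S3.
func recoverStores(storeDir string) error {
	irPaths, err := filepath.Glob(filepath.Join(storeDir, "*.ir"))
	if err != nil {
		return err
	}
	for _, irPath := range irPaths {
		tag := strings.TrimSuffix(filepath.Base(irPath), ".ir")
		zstdPath := filepath.Join(storeDir, tag+".zst")
		// Placeholder for: compress leftover IR into the Zstd file, append
		// the end-of-stream byte, upload to S3, and delete both buffers.
		if err := compactAndUpload(irPath, zstdPath, tag); err != nil {
			return err
		}
	}
	return nil
}

// compactAndUpload is a placeholder for the real recovery logic.
func compactAndUpload(irPath, zstdPath, tag string) error { return nil }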

Index

I added an index to the S3 object key which increments after each upload, helping prevent namespace collisions.
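
For illustration, the index could be folded into the key roughly like this; the key layout and function name are hypothetical, not the plugin's exact format.

package sketch

import "fmt"

// objectKey builds an S3 object key that embeds an upload index, so
// successive uploads for the same tag do not collide. The index is
// incremented after each upload.
func objectKey(prefix, tag string, index int) string {
	return fmt.Sprintf("%s/%s_%d.zst", prefix, tag, index)
}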

Validation performed

Tested that buffering and recovery work as expected with test logs. Tested that files sent to S3 could be archived by CLP.

@davemarco davemarco requested a review from davidlion July 16, 2024 20:33
@davemarco (Contributor, Author) commented:

Removed code to shorten the PR. Specifically, removed the recovery code that looks for stores on startup and sends them to S3. The recovery logic will be added back in the next PR.

Review comments on plugins/out_clp_s3/flush/flush.go (outdated, resolved)
Comment on lines 169 to 176
if ctx.Config.UseDiskBuffer {
	zstdFile, ok := tag.Writer.ZstdBuffer.(*os.File)
	if !ok {
		return fmt.Errorf("error type assertion from buffer to file failed")
	}
	// Seek to the start of the Zstd file so the upload reads it from the beginning.
	if _, err := zstdFile.Seek(0, io.SeekStart); err != nil {
		return fmt.Errorf("error failed to seek to start of Zstd file: %w", err)
	}
}
A reviewer (Member) commented:
I'm not sure if this logic should be in IrZstdWriter.Close, but ideally the underlying zstd and ir buffers shouldn't leak out of IrZstdWriter.

@davemarco davemarco requested a review from davidlion July 26, 2024 14:21
@davemarco davemarco changed the title from "Add support for buffering records to disk allowing control over upload size." to "Add support for buffering records on disk allowing control over upload size." Jul 30, 2024
@davidlion davidlion merged commit 3ed144a into y-scope:main Jul 30, 2024
2 checks passed