This repository has been archived by the owner on Feb 2, 2021. It is now read-only.

duplicate bytes are sent to blob when a buffer exceeds the append block size #24

Open
avoltz opened this issue Oct 7, 2020 · 0 comments · May be fixed by #25


avoltz commented Oct 7, 2020

There seems to be an off-by-one error when a buffer is too large for a single append and is split across several append ops to storage: one extra byte ends up in the blob at each split.
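
To illustrate the kind of off-by-one that would produce this, here is a minimal Python sketch. It is not the plugin's code, which is not shown in this issue; the function names and the tiny block size are hypothetical. It assumes the splitter slices with an inclusive end index while advancing the position by the block size, so consecutive blocks overlap by one byte.

```python
def split_inclusive(data: bytes, size: int) -> list[bytes]:
    """Hypothetical buggy splitter: the slice end is treated as inclusive,
    but the position only advances by `size`, so blocks overlap by 1 byte."""
    chunks = []
    position = 0
    while position < len(data):
        # data[position : position + size + 1] mimics an inclusive range
        chunks.append(data[position:position + size + 1])
        position += size
    return chunks


def split_exclusive(data: bytes, size: int) -> list[bytes]:
    """Splitter with an exclusive slice end: no overlap between blocks."""
    return [data[p:p + size] for p in range(0, len(data), size)]


data = b"a\n" * 5  # same "a\n" pattern as the repro file, just tiny
buggy = b"".join(split_inclusive(data, 4))
fixed = b"".join(split_exclusive(data, 4))

print(len(data), len(buggy), len(fixed))
print(buggy)
```

With 10 bytes and a block size of 4 there are two split points, and the buggy variant emits one duplicated byte at each of them, while the exclusive variant reproduces the input exactly.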

To repro this, I made a large text file:

$ yes 'a' | head -n 5000000 > bytetest.txt

Then configured fluentd with this:

<source>
  @type tail
  path /var/log/bytetest
  pos_file /var/log/fluentd/bytetest.pos
  tag bytetest
  read_from_head true
  <parse>
    @type none
  </parse>
</source>
<match bytetest>
  @type azure-storage-append-blob
  azure_storage_account          <account>
  azure_storage_access_key       <key>
  azure_container                mycontainer
  auto_create_container  true
  path /
  azure_object_key_format           bytetest.log
  time_slice_format                 %Y-%m-%d/%Y-%m-%dT%H:%M:00
  <format>
      @type single_value
  </format>
  <buffer tag,time>
    @type file
    path /var/log/fluentd/azblob.bytetest
    flush_mode interval
    flush_at_shutdown false
    timekey 60 # 1 minute
    timekey_wait 60
  </buffer>
</match>

Then append the contents to the tailed file so that fluentd buffers the entire thing:

$ sudo touch /var/log/bytetest
$ cat bytetest.txt | sudo tee -a /var/log/bytetest > /dev/null

The resulting file (bytetest2.txt in the diff below) has extra bytes:

aadmin@atf5f7ce0c04c-linux-1:~$ diff -u bytetest.txt bytetest2.txt
--- bytetest.txt	2020-10-07 13:06:30.144485029 +0000
+++ bytetest2.txt	2020-10-07 13:16:54.164703237 +0000
@@ -2097150,6 +2097150,7 @@
 a
 a
 a
+
 a
 a
 a
@@ -4194301,7 +4194302,7 @@
 a
 a
 a
-a
+aa
 a
 a
 a
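
The hunk positions in the diff can be converted to byte offsets in bytetest.txt with a little arithmetic. The Python sketch below does that under two assumptions: every line of the file is exactly the two bytes "a\n", and the duplicated byte is the one closest to each hunk (the file is a repeating pattern, so the diff alone cannot prove which byte was repeated). The helper name is hypothetical.

```python
BYTES_PER_LINE = 2  # each line of bytetest.txt is "a\n"


def line_start(line_no: int) -> int:
    """0-based byte offset of the first byte of a 1-based line number."""
    return (line_no - 1) * BYTES_PER_LINE


# First hunk: an empty line appears after original line 2097152,
# i.e. that line's trailing newline is repeated.
first_dup = line_start(2097152) + 1

# Second hunk: original line 4194304 changes from "a" to "aa",
# i.e. that line's "a" is repeated.
second_dup = line_start(4194304)

four_mib = 4 * 1024 * 1024
total_bytes = 5_000_000 * BYTES_PER_LINE

print(first_dup, second_dup, second_dup - first_dup, four_mib, total_bytes)
```

Under those assumptions the duplicated bytes sit at offsets 4194303 and 8388606, which are 4194303 bytes apart, one less than 4 MiB. That spacing is consistent with the 10,000,000-byte buffer being split into blocks just under 4 MiB, with one byte repeated at each split.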

avoltz changed the title from "when a buffer exceeds the append block size and is split, duplicate bytes are sent" to "duplicate bytes are sent to blob when a buffer exceeds the append block size" on Oct 7, 2020
avoltz linked a pull request on Oct 7, 2020 that will close this issue