
Dynamic Prefix: Support configuring uploaded file paths/names within a bucket #4

Closed
adamvduke opened this issue Dec 22, 2014 · 8 comments

@adamvduke

The current version of the plugin stores all of the uploaded files in the root of the configured bucket. It would be valuable to be able to configure the path and filename within the bucket, to allow for easy long-term archiving without having to scroll through thousands of files when viewing the contents of the bucket. Maybe a static configuration, or a proc that can be eval'd before uploading the new file?

s3 {
  ...
  path => "{year}/{month}/{day}/{hour}/{host}{uuid}"
  path_proc => 'proc { |time| "#{time.year}/#{time.month}/#{time.day}/#{time.hour}/#{Socket.gethostname}#{uuid}" }'
}
@ph
Contributor

ph commented Dec 22, 2014

I would go with the static configuration (similar to https://github.com/logstash-plugins/logstash-output-file). In the file output you can use data from the event to generate the path of the file.
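For comparison, the file output referenced above already interpolates event data into its path setting; a minimal sketch (the field names and directory layout here are illustrative, not taken from this thread):

  file {
    path => "/var/log/archive/%{type}/%{+YYYY}/%{+MM}/%{+dd}/%{host}.log"
  }

Here %{type} and %{host} are replaced per event with field values, and %{+YYYY} etc. are replaced with components of the event's timestamp.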

@pruthvintss

Hey, is there an update on how this can be done? I would like to configure the path with event fields.

@bradphipps

This would definitely be useful. Outputting to folders within a bucket based on event type would make data archival/retention/deletion processes easier and safer to manage. Sadly the following config doesn't work: bucket => "logstash-output-bucket/%{type}"

@radupantiru

+1

@shangliuyan

+1

@wiibaa
Contributor

wiibaa commented May 27, 2016

S3 newbie question:
does this request mean that

  1. the @Prefix config should be able to use event data like `%{type}/` or `%{+YYYY}/%{+MM}/%{+dd}/` to specify a "folder structure"?
  2. the filename itself, currently generated in this method, should be able to use event data?

If this is right, I could dig in further, most probably starting with 1.
Thoughts?

@adamvduke
Author

It looks like there are several implementations of this feature, to varying degrees. I'm currently not focused on this issue, but perhaps someone on this thread could contribute to one or more of the open PRs to help move them along.

#44
#59
#70
#81

@ph ph self-assigned this Aug 30, 2016
@ph ph changed the title Support configuring uploaded file paths/names within a bucket Dynamic Prefix: Support configuring uploaded file paths/names within a bucket Aug 30, 2016
ph added a commit to ph/logstash-output-s3 that referenced this issue Dec 15, 2016
**Motivation**
One of the most requested features was a way to add dynamic prefixes, using the fieldref syntax, for the files on the bucket, along with changes in the pipeline to support a shared delegator. The S3 output by nature always did single-threaded writes but had multiple workers to process the upload; the code was threadsafe when used in the concurrency `:single` mode.

This PR addresses a few problems and provides shorter and more structured code:
- The plugin now uses the V2 version of the SDK; this makes sure we receive the latest updates and changes.
- We now use S3's `upload_file` instead of reading chunks; this method is more efficient and will use multipart uploads with threads if the file is too big.
- You can now use the `fieldref` syntax in the prefix to dynamically change the target based on the events it receives.
- The upload queue is now a bounded list; this is necessary to allow back pressure to be communicated back to the pipeline, and the bound is configurable by the user.
- If the queue is full, the plugin will start the upload in the current thread.
- The plugin is now threadsafe and supports the concurrency model `shared`.
- The rotation strategy can be selected; the recommended one is `size_and_time`, which checks both of the configured limits (`size` and `time` are also available).
- The `restore` option will now use a separate threadpool with an unbounded queue.
- The `restore` option will not block the launch of Logstash and will use fewer resources than the real-time path.
- The plugin now uses `multi_receive_encode`; this optimizes the writes to the files.
- Rotate operations are now batched to reduce the number of IO calls.
- Empty files will not be uploaded by any rotation strategy.
- We now use Concurrent-Ruby for the implementation of the Java executor.
- If you have finer-grained permissions on prefixes or want a faster boot, you can disable the credentials check with `validate_credentials_on_root_bucket`.
- The credentials check will no longer fail if we can't delete the file.
- We now have a full suite of integration tests for all the defined rotation strategies.

Fixes: logstash-plugins#4 logstash-plugins#81 logstash-plugins#44 logstash-plugins#59 logstash-plugins#50
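Based on the options named in the commit message above, a configuration exercising them might look like the following sketch (option values and the bucket name are illustrative; check the plugin documentation for exact names, units, and defaults):

  s3 {
    bucket => "my-archive-bucket"
    prefix => "logs/%{type}/"
    rotation_strategy => "size_and_time"
    size_file => 5242880
    time_file => 15
    validate_credentials_on_root_bucket => false
  }

With `size_and_time`, a file is rotated and uploaded when either the size or the time limit is reached, whichever comes first.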
@FlorinAndrei

The prefix parameter for the S3 output plugin now (v5.2.0 or newer) supports string interpolation based on field values.

s3 {
  prefix => "logstash-test/%{event}/"
}
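To get the date-based folder structure asked for in the original request, field references can be combined with sprintf date interpolation in the prefix; a sketch (bucket name illustrative):

  s3 {
    bucket => "logstash-output-bucket"
    prefix => "%{+YYYY}/%{+MM}/%{+dd}/%{host}/"
  }

Each event's timestamp and host field then determine the "folder" its file lands in within the bucket.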
