Merge pull request IQSS#8313 from poikilotherm/5733-s3-cred-chain
5733 s3 cred chain
kcondon authored Jan 14, 2022
2 parents a70c30b + c3a1e21 commit 6fa131f
Showing 3 changed files with 93 additions and 24 deletions.
7 changes: 7 additions & 0 deletions doc/release-notes/5733-s3-creds-chain.md
@@ -0,0 +1,7 @@
# Providing S3 Storage Credentials via MicroProfile Config

With this release, you may use two new options to pass an access key identifier and a secret access key for S3-based
storage definitions without creating the files used by the AWS CLI tools (`~/.aws/config` & `~/.aws/credentials`).

This has been added to ease setups that use containers (Docker, Podman, Kubernetes, OpenShift) as well as testing and
development installations. See the added [documentation and a word of warning in the installation guide](https://guides.dataverse.org/en/latest/installation/config.html#s3-mpconfig).
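
For example, with a hypothetical storage driver id of `s3`, the options could be set for a quick local test via
MicroProfile Config's standard environment variable mapping. The variable names and key values below are illustrative
only, and environment variables should not be used for secrets in production (see the warning linked above).

```bash
# Dev/test only: MicroProfile Config maps dataverse.files.s3.access-key to the
# variable below by replacing non-alphanumeric characters with "_" and upper-casing.
export DATAVERSE_FILES_S3_ACCESS_KEY="changeme-access-key-id"
export DATAVERSE_FILES_S3_SECRET_KEY="changeme-secret-access-key"
```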
88 changes: 66 additions & 22 deletions doc/sphinx-guides/source/installation/config.rst
@@ -386,6 +386,9 @@ of two methods described below:
1. Manually through creation of the credentials and config files or
2. Automatically via the AWS console commands.

Some usage scenarios are easier if these files do not have to be generated at all. As an alternative, you may
provide :ref:`static credentials via MicroProfile Config <s3-mpconfig>`, as described below.

Preparation When Using Amazon's S3 Service
##########################################

@@ -526,28 +529,69 @@ been tested already and what other options have been set for a successful integr

Lastly, go ahead and restart your Payara server. With Dataverse deployed and the site online, you should be able to upload datasets and data files and see the corresponding files in your S3 bucket. Within a bucket, the folder structure emulates that found in local file storage.

S3 Storage Options
##################

============================================ ================== ============================================================================ =============
JVM Option                                   Value              Description                                                                  Default value
============================================ ================== ============================================================================ =============
dataverse.files.storage-driver-id            <id>               Enable <id> as the default storage driver.                                  ``file``
dataverse.files.<id>.bucket-name             <?>                The bucket name. See above.                                                  (none)
dataverse.files.<id>.download-redirect       ``true``/``false`` Enable direct download or proxy through Dataverse.                          ``false``
dataverse.files.<id>.upload-redirect         ``true``/``false`` Enable direct upload of files added to a dataset to the S3 store.           ``false``
dataverse.files.<id>.ingestsizelimit         <size in bytes>    Maximum size of direct upload files that should be ingested.                (none)
dataverse.files.<id>.url-expiration-minutes  <?>                If direct uploads/downloads: time until links expire. Optional.             60
dataverse.files.<id>.min-part-size           <?>                Multipart direct uploads will occur for files larger than this. Optional.   ``1024**3``
dataverse.files.<id>.custom-endpoint-url     <?>                Use custom S3 endpoint. Needs URL either with or without protocol.          (none)
dataverse.files.<id>.custom-endpoint-region  <?>                Only used when using custom endpoint. Optional.                              ``dataverse``
dataverse.files.<id>.profile                 <?>                Allows the use of AWS profiles for storage spanning multiple AWS accounts.  (none)
dataverse.files.<id>.proxy-url               <?>                URL of a proxy protecting the S3 store. Optional.                            (none)
dataverse.files.<id>.path-style-access       ``true``/``false`` Use path style buckets instead of subdomains. Optional.                     ``false``
dataverse.files.<id>.payload-signing         ``true``/``false`` Enable payload signing. Optional.                                            ``false``
dataverse.files.<id>.chunked-encoding        ``true``/``false`` Disable chunked encoding. Optional.                                          ``true``
dataverse.files.<id>.connection-pool-size    <?>                The maximum number of open connections to the S3 server.                    ``256``
============================================ ================== ============================================================================ =============
List of S3 Storage Options
##########################

.. table::
    :align: left

    ============================================ ================== ============================================================================ =============
    JVM Option                                   Value              Description                                                                  Default value
    ============================================ ================== ============================================================================ =============
    dataverse.files.storage-driver-id            <id>               Enable <id> as the default storage driver.                                  ``file``
    dataverse.files.<id>.type                    ``s3``             **Required** to mark this storage as S3 based.                              (none)
    dataverse.files.<id>.label                   <?>                **Required** label to be shown in the UI for this storage.                  (none)
    dataverse.files.<id>.bucket-name             <?>                The bucket name. See above.                                                  (none)
    dataverse.files.<id>.download-redirect       ``true``/``false`` Enable direct download or proxy through Dataverse.                          ``false``
    dataverse.files.<id>.upload-redirect         ``true``/``false`` Enable direct upload of files added to a dataset to the S3 store.           ``false``
    dataverse.files.<id>.ingestsizelimit         <size in bytes>    Maximum size of direct upload files that should be ingested.                (none)
    dataverse.files.<id>.url-expiration-minutes  <?>                If direct uploads/downloads: time until links expire. Optional.             60
    dataverse.files.<id>.min-part-size           <?>                Multipart direct uploads will occur for files larger than this. Optional.   ``1024**3``
    dataverse.files.<id>.custom-endpoint-url     <?>                Use custom S3 endpoint. Needs URL either with or without protocol.          (none)
    dataverse.files.<id>.custom-endpoint-region  <?>                Only used when using custom endpoint. Optional.                              ``dataverse``
    dataverse.files.<id>.profile                 <?>                Allows the use of AWS profiles for storage spanning multiple AWS accounts.  (none)
    dataverse.files.<id>.proxy-url               <?>                URL of a proxy protecting the S3 store. Optional.                            (none)
    dataverse.files.<id>.path-style-access       ``true``/``false`` Use path style buckets instead of subdomains. Optional.                     ``false``
    dataverse.files.<id>.payload-signing         ``true``/``false`` Enable payload signing. Optional.                                            ``false``
    dataverse.files.<id>.chunked-encoding        ``true``/``false`` Disable chunked encoding. Optional.                                          ``true``
    dataverse.files.<id>.connection-pool-size    <?>                The maximum number of open connections to the S3 server.                    ``256``
    ============================================ ================== ============================================================================ =============

.. table::
    :align: left

    ============================================ ================== ============================================================================ =============
    MicroProfile Config Option                   Value              Description                                                                  Default value
    ============================================ ================== ============================================================================ =============
    dataverse.files.<id>.access-key              <?>                :ref:`Provide static access key ID. Read before use! <s3-mpconfig>`         ``""``
    dataverse.files.<id>.secret-key              <?>                :ref:`Provide static secret access key. Read before use! <s3-mpconfig>`     ``""``
    ============================================ ================== ============================================================================ =============


.. _s3-mpconfig:

Credentials via MicroProfile Config
###################################

Optionally, you may provide static credentials for each S3 storage using MicroProfile Config options:

- ``dataverse.files.<id>.access-key`` for this storage's "access key ID"
- ``dataverse.files.<id>.secret-key`` for this storage's "secret access key"

You may provide the values for these via any of the
`supported config sources <https://docs.payara.fish/community/docs/documentation/microprofile/config/README.html>`_.

**WARNING:**

*For security, do not use the sources "environment variable" or "system property" (JVM option) in a production context!*
*Rely on a password alias, a secrets directory, or cloud-based config sources instead!*
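
For example, a password alias can be created for each option with Payara's ``asadmin`` tool. This is only a sketch:
it assumes a storage driver id of ``s3`` and that Payara's password alias config source resolves a property whose
name matches an alias name.

.. code-block:: bash

    # Create aliases named after the MicroProfile Config options; asadmin prompts for each value.
    # "s3" is an example storage driver id - adjust it to your own.
    ./asadmin create-password-alias dataverse.files.s3.access-key
    ./asadmin create-password-alias dataverse.files.s3.secret-key

    # Restart Payara afterwards so the running server picks up the new aliases.
    ./asadmin restart-domain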

**NOTE:**

1. If you provide both the AWS CLI profile files (as set up in the first step) and static keys, valid credentials from
   ``~/.aws`` take precedence over the configured keys, as illustrated below.
2. A non-empty ``dataverse.files.<id>.profile`` is ignored when no credentials can be found for that profile name.
   The current codebase uses "named profiles", as known from the AWS CLI, for credentials only.
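
As an illustration of the first note (a sketch; the paths are the AWS CLI defaults for the account running Payara):

.. code-block:: bash

    # While a valid profile exists in ~/.aws/credentials, those credentials win over any
    # access-key/secret-key configured for the store. To fall back to the MicroProfile
    # Config keys, remove or rename the AWS CLI files:
    mv ~/.aws/credentials ~/.aws/credentials.bak
    mv ~/.aws/config ~/.aws/config.bak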

Reported Working S3-Compatible Storage
######################################
22 changes: 20 additions & 2 deletions src/main/java/edu/harvard/iq/dataverse/dataaccess/S3AccessIO.java
Expand Up @@ -4,6 +4,9 @@
import com.amazonaws.ClientConfiguration;
import com.amazonaws.HttpMethod;
import com.amazonaws.SdkClientException;
import com.amazonaws.auth.AWSCredentialsProviderChain;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.s3.AmazonS3;
@@ -59,6 +62,8 @@
import java.util.Random;
import java.util.logging.Logger;
import org.apache.commons.io.IOUtils;
import org.eclipse.microprofile.config.Config;
import org.eclipse.microprofile.config.ConfigProvider;

import javax.json.Json;
import javax.json.JsonObjectBuilder;
@@ -77,6 +82,7 @@
*/
public class S3AccessIO<T extends DvObject> extends StorageIO<T> {

private static final Config config = ConfigProvider.getConfig();
private static final Logger logger = Logger.getLogger("edu.harvard.iq.dataverse.dataaccess.S3AccessIO");

private static HashMap<String, AmazonS3> driverClientMap = new HashMap<String,AmazonS3>();
@@ -1162,8 +1168,20 @@ private static AmazonS3 getClient(String driverId) {
* The default is "default" which should work when only one profile exists.
*/
String s3profile = System.getProperty("dataverse.files." + driverId + ".profile","default");

s3CB.setCredentials(new ProfileCredentialsProvider(s3profile));
ProfileCredentialsProvider profileCredentials = new ProfileCredentialsProvider(s3profile);

// Try to retrieve credentials via Microprofile Config API, too. For production use, you should not use env
// vars or system properties to provide these, but use the secrets config source provided by Payara.
AWSStaticCredentialsProvider staticCredentials = new AWSStaticCredentialsProvider(
new BasicAWSCredentials(
config.getOptionalValue("dataverse.files." + driverId + ".access-key", String.class).orElse(""),
config.getOptionalValue("dataverse.files." + driverId + ".secret-key", String.class).orElse("")
));

// Add both providers to chain - the first working provider will be used (so static credentials are the fallback)
AWSCredentialsProviderChain providerChain = new AWSCredentialsProviderChain(profileCredentials, staticCredentials);
s3CB.setCredentials(providerChain);

// let's build the client :-)
AmazonS3 client = s3CB.build();
driverClientMap.put(driverId, client);
