S3 application settings #3418

Merged
merged 63 commits into from
Sep 9, 2024
6f82ee8
Add S3/MinIO support for application settings
rmattis Sep 19, 2022
b433a07
Fix checkstyle warning
muuki88 Oct 27, 2023
017c543
Use property for version in pom.xml
muuki88 Dec 21, 2023
b93a281
Specify version for awssdk s3
muuki88 Dec 21, 2023
c163511
Adding requrireNonNull checks in S3ApplicationSettings
muuki88 Dec 21, 2023
a17952c
Recfactor lambda into method for less nesting
muuki88 Dec 21, 2023
d1c6770
Remove Option<String> as map value
muuki88 Dec 21, 2023
c2be492
Add empty lines and make withInitialSlash static
muuki88 Dec 22, 2023
58f47cb
Fix checkstyle issues
muuki88 Dec 22, 2023
5520b54
Add requireNonNull checks
muuki88 Jan 3, 2024
6ab79c9
Check that accountId and file path match
muuki88 Jan 3, 2024
183107a
Fix linting issues
muuki88 Jan 22, 2024
9ad80b5
Fix linting issue #2
muuki88 Jan 22, 2024
c9dd737
Merge branch 'master' into highfivve-github-master
muuki88 Feb 12, 2024
f02c424
Fix pom.xml
muuki88 Mar 4, 2024
79c2f94
Use Set instead of List
muuki88 Mar 4, 2024
75c7ba7
Merge remote-tracking branch 'origin/master' into highfivve-github-ma…
muuki88 Mar 4, 2024
dc0f985
Add empty lines after multi-line parameter function
muuki88 Mar 4, 2024
a18a9b7
Fix compile error in S3ApplicationSettingsTest
muuki88 Mar 4, 2024
67f047c
Remove optional
muuki88 Mar 4, 2024
5ec3976
Remove unused import
muuki88 Mar 9, 2024
c958771
Merge remote-tracking branch 'origin' into highfivve-github-master
muuki88 Mar 9, 2024
a3e350f
Use AccountPrivacyConfig.builder
muuki88 Mar 9, 2024
a0aa64a
GD-7732 handle non existing stored impressions gracefully
muuki88 Mar 25, 2024
53ce17b
GD-7732 Use SetUtils for calculating missing stored impressions
muuki88 Mar 25, 2024
5939670
GD-7732 Use atomic reference and remove timeout
muuki88 Mar 26, 2024
a085ae6
GD-7732 Use SetUtils.difference
muuki88 Mar 26, 2024
62f5096
GD-7732 Use onSuccess/onFailure instead of map/recover
muuki88 Mar 26, 2024
8132c26
GD-7732 Remove redundant Set/Stream/List conversions
muuki88 Mar 26, 2024
754e318
GD-7732 Rename aLong var to ignored
muuki88 Mar 26, 2024
1e40c25
GD-7732 getFileContents runs in parallel
muuki88 Mar 26, 2024
20cdf45
GD-7732 Use CompositeFutura.join instead of all
muuki88 Mar 26, 2024
99f499c
Merge master
muuki88 May 25, 2024
d089b27
Fix compile error in SettingsConfiguration
muuki88 May 25, 2024
f55beff
Remove unused imports
muuki88 May 26, 2024
bdc37f2
Proper initialize implementation
muuki88 May 26, 2024
7702b2c
Merge remote-tracking branch 'origin' into highfivve-github-master
muuki88 Jul 16, 2024
a4469e2
Adding region property
muuki88 Jul 16, 2024
87466a7
Migrate to Junit5 - one test case still broken
muuki88 Jul 18, 2024
9c2b22c
Use prebid logger implementation, mark vars as final and return void …
muuki88 Jul 25, 2024
6a5d1fa
Use proposed refactoring
muuki88 Jul 25, 2024
6a61e58
Use vertx.getOrCreateContext()
muuki88 Jul 25, 2024
70be3ca
Add proposed refactoring
muuki88 Jul 25, 2024
c97e0bf
Add sample config with s3
muuki88 Jul 25, 2024
7ea3745
Update aws depdendency
muuki88 Jul 25, 2024
8655f55
Remove unnecessary beans
muuki88 Jul 26, 2024
5b21d38
Remove invalidate cache logic
muuki88 Jul 26, 2024
f898648
Change private to protected to satisfy IDEA
muuki88 Jul 26, 2024
e89c0f0
Formatting
muuki88 Jul 26, 2024
1229372
remove unused imports
muuki88 Jul 29, 2024
31febb4
Remove vertx.getOrCreateContext() call
muuki88 Aug 19, 2024
cc29721
Revert "Remove vertx.getOrCreateContext() call"
muuki88 Aug 20, 2024
ea36f03
Reintroduce vertx.getOrCreateContext
muuki88 Aug 20, 2024
8f6cddd
hello checkstyle my old friend ...
muuki88 Aug 20, 2024
cda0e5a
Add force-path-style value
muuki88 Aug 22, 2024
332d5d4
Log refresh period
muuki88 Aug 22, 2024
68f2975
Remove println debugging
muuki88 Aug 22, 2024
f75d22d
Merge remote-tracking branch 'muuki/highfivve-github-master' into s3-…
CTMBNara Sep 2, 2024
c86e9cc
Refactor code and fix units.
CTMBNara Sep 2, 2024
65d28dc
Tests: S3 settings functional tests (#2837)
osulzhenko Sep 3, 2024
a6f6e52
Remove empty lines.
CTMBNara Sep 4, 2024
2c0e6ff
Wait until stored data fetched on init stage.
CTMBNara Sep 4, 2024
c209e6e
Fix unit test.
CTMBNara Sep 4, 2024
45 changes: 45 additions & 0 deletions docs/application-settings.md
@@ -259,6 +259,51 @@ Here's an example YAML file containing account-specific settings:
default: true
```

## Setting Account Configuration in S3

This works like the account configuration in a file system, except that the files are stored in
[AWS S3](https://aws.amazon.com/de/s3/) or any S3-compatible storage, such as [MinIO](https://min.io/).


The general idea is that you place each account's settings in a separate file in the bucket, named after the account id, and point Prebid Server at that bucket.

```yaml
settings:
  s3:
    accessKeyId: <S3 access key>
    secretAccessKey: <S3 secret access key>
    endpoint: <endpoint> # http://s3.storage.com
    bucket: <bucket name> # prebid-application-settings
    region: <region name> # if not provided AWS_GLOBAL will be used. Example value: 'eu-central-1'
    accounts-dir: accounts
    stored-imps-dir: stored-impressions
    stored-requests-dir: stored-requests
    stored-responses-dir: stored-responses

  # recommended to configure an in memory cache, but this is optional
  in-memory-cache:
    # example settings, tailor to your needs
    cache-size: 100000
    ttl-seconds: 1200 # 20 minutes
    # recommended to configure
    s3-update:
      refresh-rate: 900000 # Refresh every 15 minutes
      timeout: 5000
```
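
To make the directory settings concrete, here is a minimal sketch of how they could translate into object keys inside the bucket. It mirrors the key-building logic of `S3ApplicationSettings` in this change (`<dir>/<id>.json`, without doubling the slash when a stored id already starts with `/`). The names `S3KeyLayoutSketch`, `accountKey` and `storedDataKey` are hypothetical and not part of Prebid Server.

```java
/** Hypothetical helper: shows which object keys would be requested for a given id. */
final class S3KeyLayoutSketch {

    private static final String JSON_SUFFIX = ".json";

    // Stored ids that are ad unit paths may already start with a slash; avoid adding a second one.
    static String withInitialSlash(String id) {
        return id.startsWith("/") ? id : "/" + id;
    }

    // Key of the object that holds the settings of one account.
    static String accountKey(String accountsDir, String accountId) {
        return accountsDir + "/" + accountId + JSON_SUFFIX;
    }

    // Key of a stored request, impression or response, depending on the directory passed in.
    static String storedDataKey(String directory, String id) {
        return directory + withInitialSlash(id) + JSON_SUFFIX;
    }

    public static void main(String[] args) {
        System.out.println(accountKey("accounts", "979c7116-1f5a-43d4-9a87-5da3ccc4f52c"));
        System.out.println(storedDataKey("stored-impressions", "/1234/homepage/top"));
        System.out.println(storedDataKey("stored-requests", "my-request"));
    }
}
```

With the directory names from the example configuration, an account `1001` would therefore be looked up under `accounts/1001.json`.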

### File format

Store each account configuration as a `json` file named after the account id (`<account id>.json`). The `id` inside the file must match that account id. A minimal configuration may look like this:

```json
{
  "id" : "979c7116-1f5a-43d4-9a87-5da3ccc4f52c",
  "status" : "active"
}
```

This pairs well with a default configuration defined in your Prebid Server config under `settings.default-account-config`.
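
The implementation in this change rejects an account file whose `id` differs from the account id that was requested. The following is a minimal sketch of that check, assuming a simplified stand-in for the decoded file. `AccountIdCheckSketch` and `AccountFile` are hypothetical names, and `IllegalStateException` stands in for the `PreBidException` the real code throws.

```java
import java.util.Objects;

/** Hypothetical illustration of the "file id must match requested id" rule. */
final class AccountIdCheckSketch {

    // Stand-in for the decoded account file; the real account model has many more fields.
    record AccountFile(String id, String status) { }

    // Returns the account when its id equals the requested id, otherwise fails the lookup.
    static AccountFile validate(AccountFile account, String requestedAccountId) {
        final String receivedAccountId = account != null ? account.id() : null;
        if (!Objects.equals(receivedAccountId, requestedAccountId)) {
            throw new IllegalStateException(
                    "Account with id %s does not match id %s in file"
                            .formatted(requestedAccountId, receivedAccountId));
        }
        return account;
    }

    public static void main(String[] args) {
        final AccountFile account = new AccountFile("1001", "active");
        System.out.println(validate(account, "1001")); // accepted, ids match
        try {
            validate(account, "2002"); // rejected, the file belongs to another account
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In practice this means that copying an account file to a new name without also updating its `id` makes the copy unusable.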

## Setting Account Configuration in the Database

In the database approach, account properties are stored in database table(s).
6 changes: 6 additions & 0 deletions extra/pom.xml
@@ -52,6 +52,7 @@
<protobuf.version>3.21.7</protobuf.version>
<protoc.version>3.17.3</protoc.version>
<json-logic.version>1.0.7</json-logic.version>
<aws.awssdk.version>2.26.24</aws.awssdk.version>

<!-- Project test dependency versions -->
<wiremock.version>3.9.1</wiremock.version>
@@ -212,6 +213,11 @@
    <artifactId>geoip2</artifactId>
    <version>${maxmind-client.version}</version>
</dependency>
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>s3</artifactId>
    <version>${aws.awssdk.version}</version>
</dependency>
<dependency>
    <groupId>com.google.protobuf</groupId>
    <artifactId>protobuf-java</artifactId>
4 changes: 4 additions & 0 deletions pom.xml
@@ -170,6 +170,10 @@
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
</dependency>
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>s3</artifactId>
</dependency>
<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
60 changes: 60 additions & 0 deletions sample/configs/prebid-config-s3.yaml
@@ -0,0 +1,60 @@
status-response: "ok"

server:
  enable-quickack: true
  enable-reuseport: true

adapters:
  appnexus:
    enabled: true
  ix:
    enabled: true
  openx:
    enabled: true
  pubmatic:
    enabled: true
  rubicon:
    enabled: true
metrics:
  prefix: prebid
cache:
  scheme: http
  host: localhost
  path: /cache
  query: uuid=
settings:
  enforce-valid-account: false
  generate-storedrequest-bidrequest-id: true
  s3:
    accessKeyId: prebid-server-test
    secretAccessKey: nq9h6whXQURNL2NnWg3rcMlLMtGGDJeWrdl8hC9g
    endpoint: http://localhost:9000
    bucket: prebid-server-configs.example.com # prebid-application-settings
    force-path-style: true # use path-style addressing instead of virtual-hosted-style buckets
    # region: <region name> # if not provided AWS_GLOBAL will be used. Example value: 'eu-central-1'
    accounts-dir: accounts
    stored-imps-dir: stored-impressions
    stored-requests-dir: stored-requests
    stored-responses-dir: stored-responses

  in-memory-cache:
    cache-size: 10000
    ttl-seconds: 1200 # 20 minutes
    s3-update:
      refresh-rate: 900000 # Refresh every 15 minutes
      timeout: 5000

gdpr:
  default-value: 1
  vendorlist:
    v2:
      cache-dir: /var/tmp/vendor2
    v3:
      cache-dir: /var/tmp/vendor3

admin-endpoints:
  logging-changelevel:
    enabled: true
    path: /logging/changelevel
    on-application-port: true
    protected: false
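
The `force-path-style` flag in the sample above selects between the two general S3 addressing conventions. The sketch below only illustrates how the resulting URLs differ for the sample endpoint and bucket. It is not how the AWS SDK builds requests internally, and `S3AddressingSketch` with its methods is a hypothetical name. It assumes the endpoint has no trailing slash.

```java
import java.net.URI;

/** Hypothetical illustration of the two S3 addressing styles; not code from this change. */
final class S3AddressingSketch {

    // Path-style: the bucket is the first segment of the URL path.
    static URI pathStyle(URI endpoint, String bucket, String key) {
        return URI.create(endpoint + "/" + bucket + "/" + key);
    }

    // Virtual-hosted-style: the bucket becomes a subdomain of the endpoint host.
    static URI virtualHostedStyle(URI endpoint, String bucket, String key) {
        final String port = endpoint.getPort() == -1 ? "" : ":" + endpoint.getPort();
        return URI.create(endpoint.getScheme() + "://" + bucket + "." + endpoint.getHost() + port + "/" + key);
    }

    public static void main(String[] args) {
        final URI endpoint = URI.create("http://localhost:9000");
        final String bucket = "prebid-server-configs.example.com";
        System.out.println(pathStyle(endpoint, bucket, "accounts/1001.json"));
        System.out.println(virtualHostedStyle(endpoint, bucket, "accounts/1001.json"));
    }
}
```

The virtual-hosted form needs the bucket subdomain to resolve through DNS, which is presumably why the sample enables path-style addressing for a local endpoint.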
227 changes: 227 additions & 0 deletions src/main/java/org/prebid/server/settings/S3ApplicationSettings.java
@@ -0,0 +1,227 @@
package org.prebid.server.settings;

import io.vertx.core.CompositeFuture;
import io.vertx.core.Future;
import io.vertx.core.Promise;
import io.vertx.core.Vertx;
import org.apache.commons.collections4.SetUtils;
import org.apache.commons.lang3.StringUtils;
import org.prebid.server.auction.model.Tuple2;
import org.prebid.server.exception.PreBidException;
import org.prebid.server.execution.Timeout;
import org.prebid.server.json.DecodeException;
import org.prebid.server.json.JacksonMapper;
import org.prebid.server.settings.model.Account;
import org.prebid.server.settings.model.StoredDataResult;
import org.prebid.server.settings.model.StoredResponseDataResult;
import software.amazon.awssdk.core.BytesWrapper;
import software.amazon.awssdk.core.async.AsyncResponseTransformer;
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/**
 * Implementation of {@link ApplicationSettings}.
 * <p>
 * Reads application settings from JSON files in an S3 bucket. Objects are downloaded on demand for each lookup;
 * this class does not keep them in memory itself.
 * <p>
 * This expects each file in a directory to be named "{config_id}.json".
 */
public class S3ApplicationSettings implements ApplicationSettings {

    private static final String JSON_SUFFIX = ".json";

    final S3AsyncClient asyncClient;
    final String bucket;
    final String accountsDirectory;
    final String storedImpressionsDirectory;
    final String storedRequestsDirectory;
    final String storedResponsesDirectory;
    final JacksonMapper jacksonMapper;
    final Vertx vertx;

    public S3ApplicationSettings(S3AsyncClient asyncClient,
                                 String bucket,
                                 String accountsDirectory,
                                 String storedImpressionsDirectory,
                                 String storedRequestsDirectory,
                                 String storedResponsesDirectory,
                                 JacksonMapper jacksonMapper,
                                 Vertx vertx) {

        this.asyncClient = Objects.requireNonNull(asyncClient);
        this.bucket = Objects.requireNonNull(bucket);
        this.accountsDirectory = Objects.requireNonNull(accountsDirectory);
        this.storedImpressionsDirectory = Objects.requireNonNull(storedImpressionsDirectory);
        this.storedRequestsDirectory = Objects.requireNonNull(storedRequestsDirectory);
        this.storedResponsesDirectory = Objects.requireNonNull(storedResponsesDirectory);
        this.jacksonMapper = Objects.requireNonNull(jacksonMapper);
        this.vertx = Objects.requireNonNull(vertx);
    }

    @Override
    public Future<Account> getAccountById(String accountId, Timeout timeout) {
        return withTimeout(() -> downloadFile(accountsDirectory + "/" + accountId + JSON_SUFFIX), timeout)
                .map(fileContent -> decodeAccount(fileContent, accountId));
    }

    private Account decodeAccount(String fileContent, String requestedAccountId) {
        if (fileContent == null) {
            throw new PreBidException("Account with id %s not found".formatted(requestedAccountId));
        }

        final Account account;
        try {
            account = jacksonMapper.decodeValue(fileContent, Account.class);
        } catch (DecodeException e) {
            throw new PreBidException("Invalid json for account with id %s".formatted(requestedAccountId));
        }

        validateAccount(account, requestedAccountId);
        return account;
    }

    private static void validateAccount(Account account, String requestedAccountId) {
        final String receivedAccountId = account != null ? account.getId() : null;
        if (!StringUtils.equals(receivedAccountId, requestedAccountId)) {
            throw new PreBidException(
                    "Account with id %s does not match id %s in file".formatted(requestedAccountId, receivedAccountId));
        }
    }

    @Override
    public Future<StoredDataResult> getStoredData(String accountId,
                                                  Set<String> requestIds,
                                                  Set<String> impIds,
                                                  Timeout timeout) {

        return withTimeout(
                () -> Future.all(
                        getFileContents(storedRequestsDirectory, requestIds),
                        getFileContents(storedImpressionsDirectory, impIds)),
                timeout)
                .map(results -> buildStoredDataResult(
                        results.resultAt(0),
                        results.resultAt(1),
                        requestIds,
                        impIds));
    }

    private StoredDataResult buildStoredDataResult(Map<String, String> storedIdToRequest,
                                                   Map<String, String> storedIdToImp,
                                                   Set<String> requestIds,
                                                   Set<String> impIds) {

        final List<String> errors = Stream.concat(
                        missingStoredDataIds(storedIdToImp, impIds).stream()
                                .map("No stored impression found for id: %s"::formatted),
                        missingStoredDataIds(storedIdToRequest, requestIds).stream()
                                .map("No stored request found for id: %s"::formatted))
                .toList();

        return StoredDataResult.of(storedIdToRequest, storedIdToImp, errors);
    }

    private Set<String> missingStoredDataIds(Map<String, String> fileContents, Set<String> responseIds) {
        return SetUtils.difference(responseIds, fileContents.keySet());
    }

    @Override
    public Future<StoredDataResult> getAmpStoredData(String accountId,
                                                     Set<String> requestIds,
                                                     Set<String> impIds,
                                                     Timeout timeout) {

        return getStoredData(accountId, requestIds, Collections.emptySet(), timeout);
    }

    @Override
    public Future<StoredDataResult> getVideoStoredData(String accountId,
                                                       Set<String> requestIds,
                                                       Set<String> impIds,
                                                       Timeout timeout) {

        return getStoredData(accountId, requestIds, impIds, timeout);
    }

    @Override
    public Future<StoredResponseDataResult> getStoredResponses(Set<String> responseIds, Timeout timeout) {
        return withTimeout(() -> getFileContents(storedResponsesDirectory, responseIds), timeout)
                .map(storedIdToResponse -> StoredResponseDataResult.of(
                        storedIdToResponse,
                        missingStoredDataIds(storedIdToResponse, responseIds).stream()
                                .map("No stored response found for id: %s"::formatted)
                                .toList()));
    }

    @Override
    public Future<Map<String, String>> getCategories(String primaryAdServer, String publisher, Timeout timeout) {
        return Future.succeededFuture(Collections.emptyMap());
    }

    private Future<Map<String, String>> getFileContents(String directory, Set<String> ids) {
        return Future.join(ids.stream()
                        .map(impId -> downloadFile(directory + withInitialSlash(impId) + JSON_SUFFIX)
                                .map(fileContent -> Tuple2.of(impId, fileContent)))
                        .toList())
                .map(CompositeFuture::<Tuple2<String, String>>list)
                .map(impIdToFileContent -> impIdToFileContent.stream()
                        .filter(tuple -> tuple.getRight() != null)
                        .collect(Collectors.toMap(Tuple2::getLeft, Tuple2::getRight)));
    }

    /**
     * When the impression id is the ad unit path it may already start with a slash and there's no need to add
     * another one.
     *
     * @param impressionId from the bid request
     * @return impression id with only a single slash at the beginning
     */
    private static String withInitialSlash(String impressionId) {
        return impressionId.startsWith("/") ? impressionId : "/" + impressionId;
    }

    private Future<String> downloadFile(String key) {
        final GetObjectRequest request = GetObjectRequest.builder().bucket(bucket).key(key).build();

        return Future.fromCompletionStage(
                        asyncClient.getObject(request, AsyncResponseTransformer.toBytes()),
                        vertx.getOrCreateContext())
                .map(BytesWrapper::asUtf8String)
                .otherwiseEmpty();
    }

    private <T> Future<T> withTimeout(Supplier<Future<T>> futureFactory, Timeout timeout) {
        final long remainingTime = timeout.remaining();
        if (remainingTime <= 0L) {
            return Future.failedFuture(new TimeoutException("Timeout has been exceeded"));
        }

        final Promise<T> promise = Promise.promise();
        final Future<T> future = futureFactory.get();

        final long timerId = vertx.setTimer(remainingTime, id ->
                promise.tryFail(new TimeoutException("Timeout has been exceeded")));

        future.onComplete(result -> {
            vertx.cancelTimer(timerId);
            if (result.succeeded()) {
                promise.tryComplete(result.result());
            } else {
                promise.tryFail(result.cause());
            }
        });

        return promise.future();
    }
}
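
The `withTimeout` helper above races the download against a timer: whichever completes the promise first wins, and the timer is cancelled once the download finishes. For readers unfamiliar with Vert.x, the following is a rough analogue of the same pattern that uses only JDK classes (`CompletableFuture` and a `ScheduledExecutorService`) in place of Vert.x futures and timers. `TimeoutRaceSketch` is a hypothetical name, and this is a sketch of the idea rather than the code used by this change.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical JDK-only analogue of the timeout race in S3ApplicationSettings. */
final class TimeoutRaceSketch {

    // Daemon thread so pending timers never keep the JVM alive.
    private static final ScheduledExecutorService TIMER = Executors.newSingleThreadScheduledExecutor(runnable -> {
        final Thread thread = new Thread(runnable, "timeout-race-sketch");
        thread.setDaemon(true);
        return thread;
    });

    static <T> CompletableFuture<T> withTimeout(Supplier<CompletableFuture<T>> futureFactory, long remainingMillis) {
        // No time left: fail immediately without starting the operation.
        if (remainingMillis <= 0L) {
            return CompletableFuture.failedFuture(new TimeoutException("Timeout has been exceeded"));
        }

        final CompletableFuture<T> promise = new CompletableFuture<>();
        final CompletableFuture<T> future = futureFactory.get();

        // Completing an already completed future is a no-op, so the first completion wins.
        final ScheduledFuture<?> timer = TIMER.schedule(
                () -> promise.completeExceptionally(new TimeoutException("Timeout has been exceeded")),
                remainingMillis,
                TimeUnit.MILLISECONDS);

        future.whenComplete((result, error) -> {
            timer.cancel(false);
            if (error == null) {
                promise.complete(result);
            } else {
                promise.completeExceptionally(error);
            }
        });

        return promise;
    }

    public static void main(String[] args) {
        // Completes in time.
        System.out.println(TimeoutRaceSketch.<String>withTimeout(
                () -> CompletableFuture.completedFuture("ok"), 1000L).join());

        // Never completes, so the timer fails the promise.
        try {
            TimeoutRaceSketch.<String>withTimeout(() -> new CompletableFuture<String>(), 50L).join();
        } catch (CompletionException e) {
            System.out.println(e.getCause().getClass().getSimpleName());
        }
    }
}
```

As in the original, a timeout only fails the returned future; the underlying operation is not cancelled and may still finish in the background.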