updated initialization process #4

Open
wants to merge 15 commits into
base: develop
from
193 changes: 40 additions & 153 deletions README.md
@@ -59,16 +59,19 @@ DynamoDB Adapter currently supports the following DynamoDB data types

## Configuration

DynamoDB Adapter requires two tables to store metadata and configuration for
the project: `dynamodb_adapter_table_ddl` and
`dynamodb_adapter_config_manager`. There are also three configuration files
required by the adapter: `config.json`, `spanner.json`, `tables.json`.
### config.yaml
This file defines the necessary settings for the adapter. A sample configuration might look like this:

By default there are two folders, **production** and **staging**, in
[config-files](./config-files). The active folder is selected with the
environment variable `ACTIVE_ENV`, which can be set to any other environment
name so long as there is a matching directory in the `config-files` directory.
If `ACTIVE_ENV` is not set, the default environment is **staging**.

    spanner:
      project_id: "my-project-id"
      instance_id: "my-instance-id"
      database_name: "my-database-name"

The fields are:

* `project_id`: The Google Cloud project ID.
* `instance_id`: The Spanner instance ID.
* `database_name`: The database name in Spanner.
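
As a rough illustration of how these three fields might be consumed, the sketch below models the `spanner` block as a Go struct, checks that every field is set, and builds the fully qualified database path that Spanner clients expect. The type names, the `Validate` and `DatabasePath` helpers, and the `yaml` struct tags are assumptions made for this example; the adapter's actual `models.GlobalConfig` layout is not shown in this diff.

```go
package main

import (
	"errors"
	"fmt"
)

// SpannerConfig mirrors the `spanner` block of config.yaml.
// The Go names and yaml tags are hypothetical, chosen for illustration.
type SpannerConfig struct {
	ProjectID    string `yaml:"project_id"`
	InstanceID   string `yaml:"instance_id"`
	DatabaseName string `yaml:"database_name"`
}

// Config is the top-level document (assumed shape).
type Config struct {
	Spanner SpannerConfig `yaml:"spanner"`
}

// Validate reports the first required field that is empty, if any.
func (c Config) Validate() error {
	switch {
	case c.Spanner.ProjectID == "":
		return errors.New("spanner.project_id is required")
	case c.Spanner.InstanceID == "":
		return errors.New("spanner.instance_id is required")
	case c.Spanner.DatabaseName == "":
		return errors.New("spanner.database_name is required")
	}
	return nil
}

// DatabasePath builds the resource path in the form
// projects/<project>/instances/<instance>/databases/<database>.
func (c Config) DatabasePath() string {
	return fmt.Sprintf("projects/%s/instances/%s/databases/%s",
		c.Spanner.ProjectID, c.Spanner.InstanceID, c.Spanner.DatabaseName)
}

func main() {
	// Values taken from the sample configuration above.
	cfg := Config{Spanner: SpannerConfig{
		ProjectID:    "my-project-id",
		InstanceID:   "my-instance-id",
		DatabaseName: "my-database-name",
	}}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println(cfg.DatabasePath())
}
```

Validating the configuration once at startup keeps a missing field from surfacing later as an opaque connection error.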

### dynamodb_adapter_table_ddl

@@ -79,121 +82,43 @@ present in DynamoDB. This mapping is required because DynamoDB supports
special characters in column names, while Cloud Spanner allows only letters,
digits, and underscores (`_`). For more: [Spanner Naming Conventions](https://cloud.google.com/spanner/docs/data-definition-language#naming_conventions)

```sql
CREATE TABLE
dynamodb_adapter_table_ddl
(
column STRING(MAX),
tableName STRING(MAX),
dataType STRING(MAX),
originalColumn STRING(MAX),
) PRIMARY KEY (tableName, column)
```
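
To make the purpose of the `column` / `originalColumn` pair concrete, here is a minimal sketch of how a DynamoDB attribute name could be turned into a Spanner-safe column name and recorded as a row for this table. The `toSpannerColumn` helper and the `DDLRow` struct are hypothetical and only illustrate the idea; the adapter's real mapping rules are not shown in this diff, and this sketch does not handle names that begin with a digit.

```go
package main

import (
	"fmt"
	"strings"
)

// DDLRow models one row of dynamodb_adapter_table_ddl (hypothetical struct).
type DDLRow struct {
	Column         string // Spanner-safe column name
	TableName      string
	DataType       string
	OriginalColumn string // name as it appears in DynamoDB
}

// toSpannerColumn replaces every character that is not an ASCII letter,
// digit, or underscore with an underscore. This is an assumed rule.
func toSpannerColumn(original string) string {
	var b strings.Builder
	for _, r := range original {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9', r == '_':
			b.WriteRune(r)
		default:
			b.WriteByte('_')
		}
	}
	return b.String()
}

func main() {
	for _, name := range []string{"first-name", "address.zip", "plain_column"} {
		row := DDLRow{
			Column:         toSpannerColumn(name),
			TableName:      "tableName1",
			DataType:       "S",
			OriginalColumn: name,
		}
		fmt.Printf("%s -> %s\n", row.OriginalColumn, row.Column)
	}
}
```

Because distinct DynamoDB names such as `a-b` and `a.b` would collapse to the same sanitized name under a rule like this, the original name cannot be recovered from the Spanner name alone, which is one reason to store the mapping explicitly.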

![dynamodb_adapter_table_ddl sample data](images/config_spanner.png)

### dynamodb_adapter_config_manager

`dynamodb_adapter_config_manager` contains the Pub/Sub configuration used for
DynamoDB Stream compatibility. It drives the additional operations required
when data in a table changes, and it can publish the new and old data to a
given Pub/Sub topic.

```sql
CREATE TABLE
dynamodb_adapter_config_manager
(
tableName STRING(MAX),
config STRING(MAX),
cronTime STRING(MAX),
enabledStream STRING(MAX),
pubsubTopic STRING(MAX),
uniqueValue STRING(MAX),
) PRIMARY KEY (tableName)
```

### config-files/{env}/config.json
### Initialization Modes
DynamoDB Adapter supports two modes of initialization:

`config.json` contains the basic settings for DynamoDB Adapter: the GCP
project, the Cloud Spanner database, and the query record limit.
#### Dry Run Mode
This mode generates the Spanner queries required to:

| Key | Description
| ----------------- | -----------
| GoogleProjectID | Your Google Project ID
| SpannerDb | Your Spanner Database Name
| QueryLimit | Default limit for the number of records returned in query
* Create the `dynamodb_adapter_table_ddl` table in Spanner.
* Insert metadata for all DynamoDB tables into `dynamodb_adapter_table_ddl`.

These queries are printed to the console without being executed on Spanner, allowing you to review them before making changes.

For example:

```json
{
"GoogleProjectID" : "first-project",
"SpannerDb" : "test-db",
"QueryLimit" : 5000
}
```sh
go run init.go --dry_run
```

### config-files/{env}/spanner.json

`spanner.json` is a key/value file that maps table names to Cloud Spanner
instance IDs. This enables the adapter to query data for a particular table on
different Cloud Spanner instances.
#### Execution Mode
This mode executes the Spanner queries generated during the dry run on the Spanner instance. It will:

For example:
* Create the `dynamodb_adapter_table_ddl` table in Spanner if it does not exist.
* Insert metadata for all DynamoDB tables into the `dynamodb_adapter_table_ddl` table.

```json
{
"dynamodb_adapter_table_ddl": "spanner-2 ",
"dynamodb_adapter_config_manager": "spanner-2",
"tableName1": "spanner-1",
"tableName2": "spanner-1"
...
...
}
```sh
go run init.go
```
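
The difference between the two modes can be sketched as a single loop that either prints or executes each generated statement. This is not the contents of `init.go`, which this diff does not show; `runInit`, the `exec` callback, and the placeholder executor are assumptions for illustration. Only the `--dry_run` flag name and the DDL statement come from the text above.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// createDDL is the statement documented for dynamodb_adapter_table_ddl.
const createDDL = `CREATE TABLE dynamodb_adapter_table_ddl (
  column STRING(MAX),
  tableName STRING(MAX),
  dataType STRING(MAX),
  originalColumn STRING(MAX),
) PRIMARY KEY (tableName, column)`

// runInit prints every statement in dry-run mode; otherwise it hands each
// statement to exec and stops at the first error. (Hypothetical helper.)
func runInit(dryRun bool, stmts []string, exec func(string) error) error {
	for _, s := range stmts {
		if dryRun {
			fmt.Println(s)
			continue
		}
		if err := exec(s); err != nil {
			return fmt.Errorf("executing %q: %w", s, err)
		}
	}
	return nil
}

func main() {
	dryRun := flag.Bool("dry_run", false, "print the generated statements without executing them")
	flag.Parse()

	// Placeholder executor: a real implementation would send the statement
	// to Spanner. Here it only reports what it would do.
	exec := func(stmt string) error {
		fmt.Printf("executing statement (%d bytes)\n", len(stmt))
		return nil
	}

	if err := runInit(*dryRun, []string{createDDL}, exec); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Passing the executor in as a function keeps the dry-run path free of any Spanner dependency and makes both paths easy to exercise in a unit test.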

### config-files/{env}/tables.json

`tables.json` contains the description of the tables as they appear in
DynamoDB. This includes each table's primary key, columns, and index
information. This file supports the update and query operations by providing
the primary key, sort key, and any other indexes present.

| Key | Description
| ----------------- | -----------
| tableName | Name of the table in DynamoDB
| partitionKey | Primary key of the table in DynamoDB
| sortKey | Sorting key of the table in DynamoDB
| attributeTypes | Key/Value list of column names and type
| indices | Collection of index objects that represent the indexes present in the DynamoDB table

For example:

```json
{
"tableName": {
"partitionKey": "primary key or Partition key",
"sortKey": "sorting key of dynamoDB adapter",
"attributeTypes": {
"column_a": "N",
"column_b": "S",
"column_of_bytes": "B",
"my_boolean_column": "BOOL"
},
"indices": {
"indexName1": {
"sortKey": "sort key for indexName1",
"partitionKey": "partition key for indexName1"
},
"another_index": {
"sortKey": "sort key for another_index",
"partitionKey": "partition key for another_index"
}
}
},
.....
.....
}
### Prerequisites for Initialization
AWS CLI:
Configure AWS credentials:
```sh
aws configure set aws_access_key_id YOUR_ACCESS_KEY
aws configure set aws_secret_access_key YOUR_SECRET_KEY
aws configure set default.region YOUR_REGION
```
Google Cloud CLI:
Authenticate and set up your environment:
```sh
gcloud auth application-default login
gcloud config set project [MY_PROJECT_NAME]
```

## Starting DynamoDB Adapter
@@ -219,49 +144,11 @@ gcloud config set project [MY_PROJECT NAME]
```

```sh
export ACTIVE_ENV=PRODUCTION
go run main.go
```

### Internal Startup Stages

When DynamoDB Adapter starts up the following steps are performed:

* Stage 1 - Configuration is loaded according to the environment variable
  *ACTIVE_ENV*.
* Stage 2 - Connections to the Cloud Spanner instances are initialized. All
  connections are opened at startup so that they do not have to be
  re-established for every request.
* Stage 3 - `dynamodb_adapter_table_ddl` is parsed and stored in RAM for
  faster access.
* Stage 4 - `dynamodb_adapter_config_manager` is loaded into RAM. The adapter
  checks every minute whether the configuration has changed and, if so,
  updates the in-memory copy.
* Stage 5 - The API listener is started to accept DynamoDB operations.

## Advanced

### Embedding the Configuration

The rice-box package can be used to increase performance by converting the
configuration files into Go source code and thereby compiling them into the
binary. If they are not found in the binary, rice-box falls back to the
configuration files on disk.

#### Install rice package

This package is required to load the config files, which is the first step of
running DynamoDB Adapter.

Follow the [installation instructions](https://github.com/GeertJohan/go.rice#installation).

#### Run the command to create the file

Run this command whenever a config file changes so that the configuration can
be loaded directly from the generated Go file, which improves performance.

```sh
rice embed-go
```

## API Documentation
4 changes: 2 additions & 2 deletions api/v1/db.go
@@ -59,7 +59,7 @@ func RouteRequest(c *gin.Context) {
case "UpdateItem":
Update(c)
default:
c.JSON(errors.New("ValidationException", "Invalid X-Amz-Target header value of" + amzTarget).
c.JSON(errors.New("ValidationException", "Invalid X-Amz-Target header value of"+amzTarget).
Review comment (Collaborator):

Add spaces around `+`, as well as after "of".

Reply (Collaborator, Author) @nikitajain1998, Jan 20, 2025:

It seems that the linter automatically removes the spaces around `+` when the code is saved. Added a space after "of".

HTTPResponse("X-Amz-Target Header not supported"))
}
}
@@ -186,7 +186,7 @@ func queryResponse(query models.Query, c *gin.Context) {
}

if query.Limit == 0 {
query.Limit = config.ConfigurationMap.QueryLimit
query.Limit = models.GlobalConfig.Spanner.QueryLimit
}
query.ExpressionAttributeNames = ChangeColumnToSpannerExpressionName(query.TableName, query.ExpressionAttributeNames)
query = ReplaceHashRangeExpr(query)