S3toDDB

Upload a CSV file to an S3 bucket and store its data in the target Collection and Archive DynamoDB tables.

This Lambda function handles metadata ingestion for the VTDLP Access Website. It supports collection and archive (item) metadata.

Before you deploy this Lambda function, you should already have deployed the VTDLP Access Website, and the ID Minting service and Resolution service via DLPservices.

The DYNOCollectionTABLE, DYNOArchiveTABLE, and DYNOCollectionmapTABLE table names come from DynamoDB after the VTDLP Access Website is deployed. See details here.
The LongURLPath should be your VTDLP Access Website's URL, e.g. https://xxxx.yyyy.amplifyapp.com/ or your custom domain name https://iawa.lib.vt.edu/. Note: The URL should end with a slash /.
The ShortURLPath should be your Resolution service's URL, e.g. https://xxxx.execute-api.us-east-1.amazonaws.com/Prod/ or your custom domain name http://idn.lib.vt.edu/. The Resolution service API Gateway name should look something like Resolution Service API-tablename. Note: The URL should end with a slash /.
The APIKey and APIEndpoint values come from the API Gateway after the ID Minting service is deployed. The ID Minting service API Gateway name should look something like Mint Service API-tablename. Note: The APIEndpoint URL should end with a slash /.
The APPIMGROOTPATH is the CloudFront URL that serves the static images, e.g. https://img.cloud.lib.vt.edu/iawa/. Note: The URL should end with a slash /.
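Since four of the parameters above must be URLs ending in a slash, a quick check before deploying can catch mistakes early. The following is a minimal sketch, not part of this project: the helper name `check_url_parameters` is hypothetical, and the example values are the placeholders used in this README.

```python
from urllib.parse import urlparse

# Parameter names taken from this README; all of them must end with "/".
URL_PARAMETERS = ("LongURLPath", "ShortURLPath", "APIEndpoint", "APPIMGROOTPATH")


def check_url_parameters(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means no problems."""
    problems = []
    for name in URL_PARAMETERS:
        value = params.get(name, "")
        parsed = urlparse(value)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"{name}: not an http(s) URL: {value!r}")
        elif not value.endswith("/"):
            problems.append(f"{name}: must end with a slash: {value!r}")
    return problems


# Placeholder values; ShortURLPath is deliberately missing its trailing slash.
example = {
    "LongURLPath": "https://iawa.lib.vt.edu/",
    "ShortURLPath": "http://idn.lib.vt.edu",
    "APIEndpoint": "https://xxxx.execute-api.us-east-1.amazonaws.com/Prod/",
    "APPIMGROOTPATH": "https://img.cloud.lib.vt.edu/iawa/",
}
for problem in check_url_parameters(example):
    print(problem)
```

The check only looks at the shape of each value; it does not confirm that the URL actually points at your deployed service.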

You can deploy VTDLP Services in two ways: with a CloudFormation stack or with the SAM CLI.

Deploy VTDLP S3toDDB Lambda function using CloudFormation stack

Step 1: Launch CloudFormation stack

Click Next to continue

Step 2: Specify stack details

Stack name: Stack name can include letters (A-Z and a-z), numbers (0-9), and dashes (-).
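The stack name rule can be checked ahead of time. The sketch below is an illustration only: the function name is hypothetical, and the "must start with a letter" and 128-character limits are taken from CloudFormation's general naming rules rather than from this README.

```python
import re

# Letters, digits, and dashes only. Assumption (from CloudFormation's general
# rules, not this README): the name starts with a letter and is at most
# 128 characters long.
STACK_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9-]{0,127}")


def is_valid_stack_name(name: str) -> bool:
    """Return True if the name only uses characters CloudFormation accepts."""
    return STACK_NAME_RE.fullmatch(name) is not None


for candidate in ("vtdlp-s3toddb", "my_stack", "1stack"):
    print(candidate, is_valid_stack_name(candidate))
```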

Parameters: Parameters are defined in your template and allow you to input custom values when you create or update a stack.

Name	Description	Note
APPIMGROOTPATH	CloudFront URL that serves the static images. E.g. https://img.cloud.lib.vt.edu/iawa/	Required
CollectionCategory	The `VTDLP Access Website` site ID. e.g. `IAWA`	Required
DYNOCollectionTABLE	collectiontablename	Required
DYNOArchiveTABLE	archivetablename	Required
DYNOCollectionmapTABLE	collectionmaptablename	Required
NOIDNAA	The character string equivalent for the NAAN; for example, 13960 corresponds to the NAA, "archive.org"	Required
NOIDScheme	ARK (Archival Resource Key) identifier scheme that the noid utility was partly created to support. E.g. `ark:/`	Required
REGION	a valid AWS region. e.g. us-east-1	Required
LongURLPath	https://iawa.lib.vt.edu/	Required
ShortURLPath	http://idn.lib.vt.edu/	Required
APIKey	APIKEY	Required
APIEndpoint	https://xxxx.execute-api.us-east-1.amazonaws.com/Prod/	Required
S3BucketName	The name of the Amazon S3 bucket to which you upload the metadata CSV file. This S3 bucket is not the same as `BUCKETNAME` and cannot be an existing S3 bucket.	Required
NoidLayerArn	A Lambda layer ARN. The value must be `arn:aws:lambda:us-east-1:909117335741:layer:noid-layer:6`, which is also the default value.	Required

Step 3: Configure stack options

Leave it as is and click Next

Step 4: Review

Make sure all checkboxes under Capabilities section are CHECKED

Click Create stack

Deploy VTDLP S3toDDB Lambda function using SAM CLI (For advanced users)

To use the SAM CLI, you need the following tools.

SAM CLI - Install the SAM CLI
Python 3 installed
Docker - Install Docker community edition

To build and deploy your application for the first time, run the following in your shell:

sam build --use-container

The above command builds the source of the application. The SAM CLI installs the dependencies defined in requirements.txt, creates a deployment package, and saves it in the .aws-sam/build folder.

To package the application, run the following in your shell:

sam package --output-template-file packaged.yaml --s3-bucket BUCKETNAME

The above command packages the application and uploads it to the S3 bucket you specified.

Run the following in your shell to deploy the application to AWS:

sam deploy --template-file packaged.yaml --stack-name STACKNAME --s3-bucket BUCKETNAME --parameter-overrides 'APPIMGROOTPATH=https://yourURL/ CollectionCategory=CollectionCategory DYNOCollectionTABLE=CollectionTableName DYNOArchiveTABLE=ArchiveTableName DYNOCollectionmapTABLE=CollectionmapTableName NOIDNAA=53696 NOIDScheme=ark:/ REGION=us-east-1 S3BucketName=S3BucketName LongURLPath=LongURLPath ShortURLPath=ShortURLPath APIKey=APIKey APIEndpoint=APIEndpoint' --capabilities CAPABILITY_IAM --region us-east-1

The above command will package and deploy your application to AWS, with a series of prompts:

Stack Name (STACKNAME): (Required) The name of the AWS CloudFormation stack that you're deploying to. If you specify an existing stack, the command updates the stack. If you specify a new stack, the command creates it. This should be unique to your account and region, and a good starting point would be something matching your project name. Stack name can include letters (A-Z and a-z), numbers (0-9), and dashes (-).
S3 Bucket (BUCKETNAME): (Required) The name of the Amazon S3 bucket where this command uploads your AWS CloudFormation template. S3 bucket names are globally unique, and the namespace is shared by all AWS accounts. See Bucket naming rules. This S3 bucket should already exist, and you must have permission to upload files to it. BUCKETNAME is a different S3 bucket from S3BucketName.

Parameter Overrides: A string that contains AWS CloudFormation parameter overrides encoded as key-value pairs. For example, ParameterKey=ParameterValue NSTableName=DDBTableName.

Name	Description	Note
APPIMGROOTPATH	CloudFront URL that serves the static images. E.g. https://img.cloud.lib.vt.edu/iawa/	Required
CollectionCategory	The `VTDLP Access Website` site ID (case sensitive). E.g. `IAWA`	Required
DYNOCollectionTABLE	collectiontablename	Required
DYNOArchiveTABLE	archivetablename	Required
DYNOCollectionmapTABLE	collectionmaptablename	Required
NOIDNAA	The character string equivalent for the NAAN; for example, 13960 corresponds to the NAA, "archive.org"	Required
NOIDScheme	ARK (Archival Resource Key) identifier scheme that the noid utility was partly created to support. E.g. `ark:/`	Required
REGION	a valid AWS region. e.g. us-east-1	Required
LongURLPath	https://iawa.lib.vt.edu/	Required
ShortURLPath	http://idn.lib.vt.edu/	Required
APIKey	APIKEY	Required
APIEndpoint	https://xxxx.execute-api.us-east-1.amazonaws.com/Prod/	Required
S3BucketName	The name of the Amazon S3 bucket to which you upload the metadata CSV file. This S3 bucket is not the same as `BUCKETNAME` and cannot be an existing S3 bucket.	Required
NoidLayerArn	A Lambda layer ARN. The value must be `arn:aws:lambda:us-east-1:909117335741:layer:noid-layer:6`, which is also the default value.	Required

Allow SAM CLI IAM role creation: Many AWS SAM templates, including this example, create the AWS IAM roles that the included AWS Lambda function(s) need to access AWS services. By default, these are scoped down to the minimum required permissions. To deploy an AWS CloudFormation stack that creates or modifies IAM roles, the CAPABILITY_IAM value for capabilities must be provided. If permission isn't provided through this prompt, you must explicitly pass --capabilities CAPABILITY_IAM to the sam deploy command to deploy this example. Learn more.
AWS Region: The AWS region you want to deploy your app to.
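Because the parameter overrides are space-separated key-value pairs, a value containing whitespace would be split into separate tokens. The sketch below shows one way to assemble the `sam deploy` arguments from a dictionary; the helper name `build_deploy_command` is hypothetical, and the values are the placeholders from this README.

```python
import shlex


def build_deploy_command(stack_name, bucket, region, parameters):
    """Build the `sam deploy` argument list used in this README.

    The overrides are joined into one space-separated argument, mirroring
    the single-quoted string in the shell command above.
    """
    pairs = []
    for key, value in parameters.items():
        value = str(value)
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"{key}: value must be non-empty and contain no whitespace")
        pairs.append(f"{key}={value}")
    return [
        "sam", "deploy",
        "--template-file", "packaged.yaml",
        "--stack-name", stack_name,
        "--s3-bucket", bucket,
        "--parameter-overrides", " ".join(pairs),
        "--capabilities", "CAPABILITY_IAM",
        "--region", region,
    ]


# Placeholder values; a real deployment needs every required parameter.
command = build_deploy_command(
    "STACKNAME", "BUCKETNAME", "us-east-1",
    {"NOIDNAA": "53696", "NOIDScheme": "ark:/", "REGION": "us-east-1"},
)
print(shlex.join(command))
```

The resulting list could be passed to `subprocess.run(command, check=True)` once the SAM CLI is installed; the sketch itself only prints the command.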

Usage

Prepare "collection_metadata.csv" and "index.csv" for Collection and Item ingestion, respectively.
Metadata ingestion:
- For Collection ingestion: Set the filename as "collection_metadata.csv" and upload it to S3BucketName.
- For Item ingestion: Set the filename as "index.csv" and upload it to S3BucketName.
Go to DynamoDB to see the end results in the Collection, Archive, and Collectionmap tables.
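The upload step can also be scripted. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured; the helper name `upload_metadata` is hypothetical, and the only details taken from this README are the two expected file names.

```python
# File names this README says to use for each ingestion type.
EXPECTED_KEYS = {
    "collection": "collection_metadata.csv",
    "item": "index.csv",
}


def upload_metadata(s3_client, local_csv, bucket, kind):
    """Upload a local CSV under the file name expected for `kind`.

    `s3_client` is anything with boto3's S3 `upload_file(Filename, Bucket, Key)`
    method, e.g. `boto3.client("s3")`. Returns the object key that was used.
    """
    if kind not in EXPECTED_KEYS:
        raise ValueError(f"kind must be one of {sorted(EXPECTED_KEYS)}, got {kind!r}")
    key = EXPECTED_KEYS[kind]
    s3_client.upload_file(str(local_csv), bucket, key)
    return key


# Example call (not executed here; the bucket and file names are placeholders):
#   import boto3
#   upload_metadata(boto3.client("s3"), "my_collection.csv", "S3BucketName", "collection")
```

Renaming at upload time means the local file can keep any name, while the object in S3BucketName always gets the name given in the steps above.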

Tests

Tests are defined in the tests folder in this project. Use pip to install the test dependencies and run the tests. You must have an env file named custom_pytest.ini; see custom_pytest.ini.example.

python -m pytest --cov=. tests/unit -v -c custom_pytest.ini

These DynamoDB tables are used for testing:

archive_test
collection_test
collectionmap_test

These files are used for testing and stored in S3 bucket: vtdlp-dev-test

new_collection_metadata.csv
single_archive_metadata.csv
SFD_index.csv

Other test files are located in the tests/unit/test_data/ folder.

Cleanup

To delete the sample application that you created, use the AWS CLI. Assuming you used your project name for the stack name, you can run the following:

aws cloudformation delete-stack --stack-name stackname
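The same cleanup can be done from Python. This is a sketch under the assumption that boto3 is installed and credentials are configured; the helper name `delete_stack_and_wait` is hypothetical, while `delete_stack` and the `stack_delete_complete` waiter are standard boto3 CloudFormation client features.

```python
def delete_stack_and_wait(cfn_client, stack_name):
    """Request stack deletion, then block until CloudFormation reports it done.

    `cfn_client` is a boto3 CloudFormation client, e.g.
    `boto3.client("cloudformation", region_name="us-east-1")`.
    """
    cfn_client.delete_stack(StackName=stack_name)
    # The waiter polls the stack status and raises if deletion fails.
    waiter = cfn_client.get_waiter("stack_delete_complete")
    waiter.wait(StackName=stack_name)


# Example call (not executed here; "stackname" is the placeholder used above):
#   import boto3
#   delete_stack_and_wait(boto3.client("cloudformation"), "stackname")
```

If the stack created the metadata bucket and it still holds files, deletion may fail, since CloudFormation cannot delete a non-empty S3 bucket; emptying it first is likely needed.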
