CSV file upload to a S3 bucket and store the data into target Collection and Archive DynamoDB tables
This Lambda function is for metadata ingesetion for the VTDLP Access Website. It supports collection and archive(item) metadata.
Before you deploy this Lambda function, you should already have VTDLP Access Website deployed ready, and ID Minting service
and Resolution service
deployed ready via DLPservices.
-
The
DYNOCollectionTABLE
,DYNOArchiveTABLE
, andDYNOCollectionmapTABLE
table name information are from the DynamoDB after VTDLP Access Website is deployed. See detailss here -
The
LongURLPath
should be yourVTDLP Access Website
's URL. E.g.https://xxxx.yyyy.amplifyapp.com/
or your custom domain namehttps://iawa.lib.vt.edu/
. Note: The URL should end with a slash/
. -
The
ShortURLPath
should be yourResolution service
's URL. E.g.https://xxxx.execute-api.us-east-1.amazonaws.com/Prod/
or your custom domain namehttp://idn.lib.vt.edu/
. TheResolution service
API Gateway name should look something likeResolution Service API-tablename
. Note: The URL should end with a slash/
. -
The
APIKey
andAPIEndpoint
information are from the API gateway after ID Minting service is deployed. TheID Minting service
API Gateway name should look something likeMint Service API-tablename
. Note: TheAPIEndpoint
URL should end with a slash/
. -
The
APPIMGROOTPATH
is a URL point to your Cloudfront URL which serves the static images. E.g. https://img.cloud.lib.vt.edu/iawa/. Note: The URL should end with a slash/
.
You can use two different methods to deploy VTDLP Services. The first method is using CloudFormation stack and the second method is using SAM CLI.
Click Next to continue
-
Stack name: Stack name can include letters (A-Z and a-z), numbers (0-9), and dashes (-).
-
Parameters: Parameters are defined in your template and allow you to input custom values when you create or update a stack.
Name Description Note APPIMGROOTPATH Cloudfront URL which serves the static images. E.g. https://img.cloud.lib.vt.edu/iawa/ Required CollectionCategory The VTDLP Access Website
site ID. e.g.IAWA
Required DYNOCollectionTABLE collectiontablename Required DYNOArchiveTABLE archivetablename Required DYNOCollectionmapTABLE collectionmaptablename Required NOIDNAA The character string equivalent for the NAAN; for example, 13960 corresponds to the NAA, "archive.org" Required NOIDScheme ARK (Archival Resource Key) identifier scheme that the noid utility was partly created to support. E.g. ark:/
Required REGION a valid AWS region. e.g. us-east-1 Required LongURLPath https://iawa.lib.vt.edu/ Required ShortURLPath http://idn.lib.vt.edu/ Required APIKey APIKEY Required APIEndpoint https://xxxx.execute-api.us-east-1.amazonaws.com/Prod/ Required S3BucketName An Amazon S3 bucket name for you to upload the metadata CSV file. This S3 bucket is not the same as BUCKETNAME
and can not be an existing S3 bucket.Required NoidLayerArn A Lambda layer Arn. The value must be arn:aws:lambda:us-east-1:909117335741:layer:noid-layer:6
. It is also the default value.Required
Leave it as is and click Next
Make sure all checkboxes under Capabilities section are CHECKED
Click Create stack
To use the SAM CLI, you need the following tools.
- SAM CLI - Install the SAM CLI
- Python 3 installed
- Docker - Install Docker community edition
To build and deploy your application for the first time, run the following in your shell:
sam build --use-container
Above command will build the source of the application. The SAM CLI installs dependencies defined in requirements.txt
, creates a deployment package, and saves it in the .aws-sam/build
folder.
To package the application, run the following in your shell:
sam package --output-template-file packaged.yaml --s3-bucket BUCKETNAME
Above command will package the application and upload it to the S3 bucket you specified.
Run the following in your shell to deploy the application to AWS:
sam deploy --template-file packaged.yaml --stack-name STACKNAME --s3-bucket BUCKETNAME --parameter-overrides 'APPIMGROOTPATH=https://yourURL/ CollectionCategory=collection type DYNOCollectionTABLE=CollectionTableName DYNOArchiveTABLE=ArchiveTableName DYNOCollectionmapTABLE=CollectionmapTableName NOIDNAA=53696 NOIDScheme=ark:/ REGION=us-east-1 S3BucketName=S3BucketName LongURLPath=LongURLPath ShortURLPath=ShortURLPath APIKey=APIKey APIEndpoint=APIEndpoint' --capabilities CAPABILITY_IAM --region us-east-1
The above command will package and deploy your application to AWS, with a series of prompts:
-
Stack Name (STACKNAME): (Required) The name of the AWS CloudFormation stack that you're deploying to. If you specify an existing stack, the command updates the stack. If you specify a new stack, the command creates it. This should be unique to your account and region, and a good starting point would be something matching your project name. Stack name can include letters (A-Z and a-z), numbers (0-9), and dashes (-).
-
S3 Bucket (BUCKETNAME): (Required) An Amazon S3 bucket name where this command uploads your AWS CloudFormation template. S3 bucket name is globally unique, and the namespace is shared by all AWS accounts. See Bucket naming rules. This S3 bucket should be already exist and you have the permission to upload files to it. This
BUCKETNAME
is a different S3 bucket, not the same S3 bucket asS3BucketName
. -
Parameter Overrides: A string that contains AWS CloudFormation parameter overrides encoded as key-value pairs. For example, ParameterKey=ParameterValue NSTableName=DDBTableName.
Name Description Note APPIMGROOTPATH Cloudfront URL which serves the static images. E.g. https://img.cloud.lib.vt.edu/iawa/ Required CollectionCategory The VTDLP Access Website
site ID and it is case sensitive. e.g.IAWA
Required DYNOCollectionTABLE collectiontablename Required DYNOArchiveTABLE archivetablename Required DYNOCollectionmapTABLE collectionmaptablename Required NOIDNAA The character string equivalent for the NAAN; for example, 13960 corresponds to the NAA, "archive.org" Required NOIDScheme ARK (Archival Resource Key) identifier scheme that the noid utility was partly created to support. E.g. ark:/
Required REGION a valid AWS region. e.g. us-east-1 Required LongURLPath https://iawa.lib.vt.edu/ Required ShortURLPath http://idn.lib.vt.edu/ Required APIKey APIKEY Required APIEndpoint https://xxxx.execute-api.us-east-1.amazonaws.com/Prod/ Required S3BucketName An Amazon S3 bucket name for you to upload the metadata CSV file. This S3 bucket is not the same as BUCKETNAME
and can not be an existing S3 bucket.Required NoidLayerArn A Lambda layer Arn. The value must be arn:aws:lambda:us-east-1:909117335741:layer:noid-layer:6
.It is also the default value.Required -
Allow SAM CLI IAM role creation: Many AWS SAM templates, including this example, create AWS IAM roles required for the AWS Lambda function(s) included to access AWS services. By default, these are scoped down to minimum required permissions. To deploy an AWS CloudFormation stack which creates or modified IAM roles, the
CAPABILITY_IAM
value forcapabilities
must be provided. If permission isn't provided through this prompt, to deploy this example you must explicitly pass--capabilities CAPABILITY_IAM
to thesam deploy
command. Learn more. -
AWS Region: The AWS region you want to deploy your app to.
- Prepare "collection_metadata.csv" and "index.csv" for Collection and Item ingestion, respectively.
- Metadata ingestion:
- For Collection ingestion: Set the filename as "collection_metadata.csv" and upload it to
S3BucketName
. - For Item ingestion: Set the filename as "index.csv." and upload it to
S3BucketName
.
- For Collection ingestion: Set the filename as "collection_metadata.csv" and upload it to
- Go to DynamoDB to see the end results in Collection, Archive and Collectionmap tables.
Tests are defined in the tests
folder in this project. Use PIP to install the test dependencies and run tests. You
must have a env file: custom_pytest.ini.example
python -m pytest --cov=. tests/unit -v -c custom_pytest.ini
These DynamoDB tables are used for testing:
archive_test
collection_test
collectionmap_test
These files are used for testing and stored in S3 bucket: vtdlp-dev-test
new_collection_metadata.csv
single_archive_metadata.csv
SFD_index.csv
Other test files are located in tests/unit/test_data/ folder
To delete the sample application that you created, use the AWS CLI. Assuming you used your project name for the stack name, you can run the following:
aws cloudformation delete-stack --stack-name stackname