This is a sample project that demonstrates a full end-to-end serverless application built entirely with AWS serverless services. It uses Amazon Textract to analyze a PDF file. You can preconfigure natural language queries that Textract will attempt to answer (e.g. 'What is the date of service of this invoice?'). It also submits the document for expense document analysis and returns data about the document as a set of expense metadata documents.

This demonstrates how you can use .NET for an end-to-end serverless document processing solution in AWS. This README details the solution in full. The service can be deployed into an AWS account and, because it is self-contained, can serve as an add-on to an existing application.

This solution is meant to be useful in a real-world scenario, in which multiple technologies, techniques, and services are used. Specifically, this document analysis tool showcases the technologies listed below.
- Custom runtime Lambda functions using .NET 8.0
- Observability implemented using Powertools for AWS Lambda (.NET)
- The Lambda Annotations Framework to implement dependency injection, with source generation to automatically create the "Main" method (a brief sketch of this pattern appears after this list)
- All infrastructure is expressed with the AWS CDK (C#/.NET) targeting .NET 8.0.
- Data and configuration are stored in an Amazon DynamoDB table. Data access uses the .NET object persistence model to simplify data access with POCO objects (also shown in the sketch below).
- The Lambda functions are orchestrated using an AWS Step Functions standard workflow. A standard workflow was chosen because it supports the Wait for Callback (task token) integration pattern.
- Amazon Textract provides document analysis (standard and expense) capabilities.
- An Amazon EventBridge rule is used to automatically trigger the workflow when a document is uploaded to an Amazon S3 bucket.
- Feedback is provided to the client application through the use of two Amazon SQS queues.
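To illustrate how these pieces fit together, here is a minimal, hypothetical sketch of a Lambda Annotations function that uses Powertools logging and the DynamoDB object persistence model. The class, table, and property names below are illustrative assumptions and are not taken from this project's source code.

```csharp
using System.Threading.Tasks;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.DataModel;
using Amazon.Lambda.Annotations;
using Amazon.Lambda.Core;
using AWS.Lambda.Powertools.Logging;
using Microsoft.Extensions.DependencyInjection;

// Standard serializer registration used by Lambda Annotations projects.
[assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.SystemTextJson.DefaultLambdaJsonSerializer))]

// Hypothetical POCO mapped to a table via the object persistence model.
[DynamoDBTable("ProcessData")]
public class ProcessRecord
{
    [DynamoDBHashKey]
    public string Id { get; set; } = string.Empty;

    public string Status { get; set; } = string.Empty;
}

// Dependency injection setup; the Annotations source generator wires this
// into the generated entry point.
[LambdaStartup]
public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddAWSService<IAmazonDynamoDB>();
        services.AddSingleton<IDynamoDBContext>(sp =>
            new DynamoDBContext(sp.GetRequiredService<IAmazonDynamoDB>()));
    }
}

public class ExampleFunction
{
    private readonly IDynamoDBContext _dbContext;

    // Constructor injection provided by the Annotations framework.
    public ExampleFunction(IDynamoDBContext dbContext) => _dbContext = dbContext;

    // [LambdaFunction] tells the source generator to emit the handler boilerplate;
    // [Logging] enables Powertools structured logging for this invocation.
    [LambdaFunction]
    [Logging(LogEvent = true)]
    public async Task HandlerAsync(string documentId, ILambdaContext context)
    {
        Logger.LogInformation($"Initializing processing for {documentId}");
        await _dbContext.SaveAsync(new ProcessRecord { Id = documentId, Status = "INITIALIZED" });
    }
}
```

The functions under `/functions` use these same libraries, though their actual models and handlers differ from this sketch.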
This is an overview of the process. The names of the resources are generic, since each deployment will yield resources with different physical names (to avoid resource name collisions). Some design decisions are noted below, but there are alternate ways of accomplishing some of the items.

This application is self-contained. We will refer to an external application that integrates with this system as the "client application". There can be more than one client application, and a client application that provides input (i.e. uploads a file) may be different from an application that responds to the output of the system.
1. A client application writes a PDF to the `InputBucket` S3 bucket (a sketch of such an upload appears after this list).
   - If the service has been configured to use natural language queries (explanation below), a subset of them can be specified using a colon-separated list of query keys, supplied as a tag on the uploaded S3 object. For example: Tag: "Queries", Value: 'q1:q2:q3'. If no queries are supplied, then all configured queries will be used.
   - The client can also supply an identifier that will be passed through the entire system, allowing the correlation of an uploaded file's result with the client's system. For example: Tag: "Id", Value: "abc-12345".
   - Note: A client application must have permissions to write files to the `InputBucket`. A CloudFormation output, `inputBucketPolicyOutput`, is created when this is deployed and provides an example IAM policy that you can use to allow access to the bucket.
2. An EventBridge rule triggers the Step Function.
3. The Step Function definition can be seen here. It consists of seven Lambda function integrations and two SQS integrations. Any unrecoverable errors (from any of the Lambda functions) are caught and sent to the `FailureFunction` function, which then writes a message to the `FailureQueue` with details for the client.
4. The EventBridge message is parsed by the `InitializeProcessing` Lambda function, which creates a record in the `ProcessData` DynamoDB table. It also retrieves the query text from the `ConfigData` DynamoDB table for use in the next step.
5. In the `SubmitToTextract` Lambda function, the uploaded file is submitted to Textract for standard analysis. This step in the workflow uses the Wait for Task Token pattern; the step function will pause until restarted.
6. When Textract is complete, it writes the output to the `TextractBucket` S3 bucket and sends a message to the supplied SNS topic, `TextractSuccessTopic`. The `RestartStepFunction` Lambda function then calls SendTaskSuccess or SendTaskFailure depending on the Textract job status (a sketch of this callback appears after this list).
7. The `ProcessTextractQueryResults` function retrieves the results from the `TextractBucket` bucket and writes all the query results to the `ProcessData` table.
8. In the `SubmitToTextractExpense` Lambda function, the uploaded file is submitted to Textract for expense analysis. This step also uses the Wait for Task Token pattern; the step function will pause until restarted.
9. When Textract is complete, step 6 is repeated, and the Step Function is restarted accordingly.
10. The `ProcessTextractExpenseResults` function retrieves the results from the `TextractBucket` bucket and writes all the expense results to the `ProcessData` table.
11. The `SuccessFunction` Lambda function formats the results from both the query and expense analyses, and writes the data to the `SuccessQueue` queue (a sketch of a client consuming this queue appears after this list).

Note: A client application must have permissions to access both the `SuccessQueue` and the `FailureQueue`. A CloudFormation output is created for each when this application is deployed, `failureQueueOutput` and `successQueueOutput`. These provide example IAM policies that you can use to allow access to the queues.
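As a concrete illustration of step 1, the hypothetical client-side snippet below uploads a PDF to the input bucket with the `Queries` and `Id` tags described above. The bucket name, object key, and file path are placeholders; only the tag names and value formats come from this README.

```csharp
using System.Collections.Generic;
using Amazon.S3;
using Amazon.S3.Model;

var s3 = new AmazonS3Client();

await s3.PutObjectAsync(new PutObjectRequest
{
    BucketName = "my-input-bucket",        // placeholder: the deployed InputBucket's physical name
    Key = "invoices/invoice-001.pdf",
    FilePath = "./invoice-001.pdf",
    TagSet = new List<Tag>
    {
        new Tag { Key = "Queries", Value = "q1:q2:q3" }, // colon-separated query keys (optional)
        new Tag { Key = "Id", Value = "abc-12345" }      // client correlation identifier (optional)
    }
});
```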
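Steps 5, 6, 8, and 9 rely on the Wait for Task Token pattern. The sketch below shows the general shape of such a callback, assuming the task token was stored when the Textract job was submitted; it is not the project's actual `RestartStepFunction` implementation.

```csharp
using System.Threading.Tasks;
using Amazon.StepFunctions;
using Amazon.StepFunctions.Model;

public class TaskTokenCallback
{
    private readonly IAmazonStepFunctions _stepFunctions = new AmazonStepFunctionsClient();

    // jobStatus would come from the Textract completion notification (SNS),
    // and taskToken from wherever it was persisted at submission time.
    public async Task RestartWorkflowAsync(string taskToken, string jobId, string jobStatus)
    {
        if (jobStatus == "SUCCEEDED")
        {
            await _stepFunctions.SendTaskSuccessAsync(new SendTaskSuccessRequest
            {
                TaskToken = taskToken,
                Output = $"{{\"jobId\":\"{jobId}\"}}" // JSON payload handed to the next state
            });
        }
        else
        {
            await _stepFunctions.SendTaskFailureAsync(new SendTaskFailureRequest
            {
                TaskToken = taskToken,
                Error = "TextractJobFailed",
                Cause = $"Textract job {jobId} finished with status {jobStatus}"
            });
        }
    }
}
```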
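Finally, a client application that consumes results from the queues might look something like the following sketch. The queue URL is a placeholder; use the value from the deployed stack's outputs, together with the example IAM policies mentioned in the note above.

```csharp
using System;
using Amazon.SQS;
using Amazon.SQS.Model;

var sqs = new AmazonSQSClient();
var successQueueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/SuccessQueue"; // placeholder

var response = await sqs.ReceiveMessageAsync(new ReceiveMessageRequest
{
    QueueUrl = successQueueUrl,
    MaxNumberOfMessages = 10,
    WaitTimeSeconds = 20 // long polling
});

foreach (var message in response.Messages)
{
    Console.WriteLine(message.Body); // formatted query and expense results
    await sqs.DeleteMessageAsync(successQueueUrl, message.ReceiptHandle);
}
```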
This is a brief explanation of the solution's codebase.

- `/assets` - Images and diagrams
- `/functions` - Lambda function source code
- `/infrastructure` - CDK .NET project source code
To deploy this solution, you will need the following prerequisites.

- Clone this repository.
- You will need an AWS account and an IAM user with adequate permissions to deploy resources. You will need to set up a credentials profile. For the remainder of this exercise, we will assume the profile is named `my-profile`.
- Install and set up the AWS CLI.
- Install the .NET 8.0 SDK.
- Install the latest version of the AWS CDK, and bootstrap the environment.
- Install the AWS Amazon.Lambda.Tools .NET global CLI tools.
Before you deploy the solution, you will need to build the .NET 8.0 Lambda functions. A script is included to build all the Lambda functions, for both Windows and Linux/Mac.

Windows:

```
cd infrastructure
.\build.bat
```

Linux/Mac:

```
cd infrastructure
sh build.sh
```

The process will take several minutes. The Lambda function archives are output to the `infrastructure/function-output/` directory.
To deploy the CDK application, you will need to supply several context values. They are:

- `environmentName` - The environment deployed to (e.g. 'dev', 'test', 'prod')
- `stackName` - The name of the CloudFormation stack this will be deployed as. Note: the stack name will have the environment name as a suffix.
- `functionDirectory` - The directory where the .NET Lambda function archives are located. If not supplied, this defaults to './function-output'.
- `resourcePrefix` - A prefix used when physically naming resources. This must be all lowercase and alphanumeric. Defaults to 'docprocessing'.
You can supply these runtime context values in several ways. For the purpose of this demo, you can use a local file.

Create a file called `cdk.context.json` in the `infrastructure` directory, and populate it similarly to the following:

```json
{
  "environmentName": "dev",
  "stackName": "docAnalysis",
  "functionBaseDirectory": "./function-output",
  "resourcePrefix": "doc"
}
```
Synthesize the CDK stack with:

```
cdk synth
```

Note: You can include the build in the synthesis step by adding the `--build` switch:

```
cdk synth --build .\build.bat
```

You can then deploy the stack with the following command:

```
cdk deploy --profile my-profile
```
You can remove the infrastructure by using the following command:

```
cdk destroy --profile my-profile
```

You can also manually delete the CloudFormation stack that was originally created. This will delete any resources created, as well as any data contained within your S3 buckets or DynamoDB tables.
These are some items that will be added at a later date to make the solution more extensible:

- Create a Systems Manager Parameter that will parameterize the following items:
  - The name of the tag that is used to specify queries to be applied to the document analysis (currently hardcoded to 'Queries')
- Add a configuration switch that enables building the .NET Lambda functions with Native AOT