Upload recording feature #787

Open
wants to merge 13 commits into
base: main
Changes from 8 commits
1 change: 1 addition & 0 deletions .pre-commit-config.yaml
@@ -3,6 +3,7 @@ repos:
    rev: v2.3.0
    hooks:
      - id: check-yaml
        args: ['--unsafe']
      - id: end-of-file-fixer
      - id: trailing-whitespace
  - repo: https://github.com/psf/black
12 changes: 12 additions & 0 deletions README.md
@@ -226,6 +226,18 @@ To record browser events in Google Chrome (required by the `BrowserReplayStrateg

6. Set the `RECORD_BROWSER_EVENTS` flag to `true` in `openadapt/data/config.json`.

### Admin features
To self-host the app, run the following scripts:

**recording_uploader**
- Ensure that you have valid AWS credentials added in your environment
Review comment (Member): What do you think about loading the AWS credentials from config.py?

- Run the following command to create a stack on AWS that you can upload recordings to
```bash
python -m admin.recording_uploader.deploy
```
- If the script runs successfully, you should see an API Gateway URL in the output
- Copy the URL and change the value of `RECORDING_UPLOAD_URL` in `openadapt/config.py` to the URL you copied
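
As a rough illustration of how a client could use the deployed endpoint, the sketch below posts a `user_id` to the API Gateway URL, reads the presigned URL from the `url` field of the response, and uploads the recording bytes with an HTTP PUT. The function names and the injectable `opener` parameter are assumptions made for this example, and the URL constant is a placeholder; the real client lives in `upload_recording_to_s3` in `openadapt.share`, which is not shown here and may work differently.

```python
import json
import urllib.request
from typing import Callable

# Hypothetical placeholder; use the API Gateway URL printed by the deploy script.
RECORDING_UPLOAD_URL = "https://example.execute-api.us-east-1.amazonaws.com/Prod/upload/"


def build_presign_request(upload_url: str, user_id: str) -> urllib.request.Request:
    """Build the POST request that asks the Lambda for a presigned URL."""
    body = json.dumps({"user_id": user_id}).encode("utf-8")
    return urllib.request.Request(
        upload_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_put_request(presigned_url: str, payload: bytes) -> urllib.request.Request:
    """Build the PUT request that uploads the recording bytes to S3."""
    return urllib.request.Request(presigned_url, data=payload, method="PUT")


def upload_recording(
    upload_url: str,
    user_id: str,
    payload: bytes,
    opener: Callable = urllib.request.urlopen,
) -> int:
    """Request a presigned URL, then PUT the payload to it.

    Returns the HTTP status code of the PUT response.
    """
    with opener(build_presign_request(upload_url, user_id)) as response:
        presigned_url = json.loads(response.read())["url"]
    with opener(build_put_request(presigned_url, payload)) as response:
        return response.status
```

Passing the opener as a parameter keeps the flow testable without network access; in normal use the default `urllib.request.urlopen` is sufficient.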

## Features

### State-of-the-art GUI understanding via [Segment Anything in High Quality](https://github.com/SysCV/sam-hq):
87 changes: 87 additions & 0 deletions admin/recording_uploader/README.md
@@ -0,0 +1,87 @@
# recording-uploader

This project contains source code and supporting files for a serverless application that you can deploy with the SAM CLI. It includes the following files and folders.

- uploader - Code for the application's Lambda function.
- template.yaml - A template that defines the application's AWS resources.

The application uses several AWS resources, including Lambda functions and an API Gateway API. These resources are defined in the `template.yaml` file in this project. You can update the template to add AWS resources through the same deployment process that updates your application code.

If you prefer to use an integrated development environment (IDE) to build and test your application, you can use the AWS Toolkit.
The AWS Toolkit is an open source plug-in for popular IDEs that uses the SAM CLI to build and deploy serverless applications on AWS. The AWS Toolkit also adds a simplified step-through debugging experience for Lambda function code. See the following links to get started.

* [CLion](https://docs.aws.amazon.com/toolkit-for-jetbrains/latest/userguide/welcome.html)
* [GoLand](https://docs.aws.amazon.com/toolkit-for-jetbrains/latest/userguide/welcome.html)
* [IntelliJ](https://docs.aws.amazon.com/toolkit-for-jetbrains/latest/userguide/welcome.html)
* [WebStorm](https://docs.aws.amazon.com/toolkit-for-jetbrains/latest/userguide/welcome.html)
* [Rider](https://docs.aws.amazon.com/toolkit-for-jetbrains/latest/userguide/welcome.html)
* [PhpStorm](https://docs.aws.amazon.com/toolkit-for-jetbrains/latest/userguide/welcome.html)
* [PyCharm](https://docs.aws.amazon.com/toolkit-for-jetbrains/latest/userguide/welcome.html)
* [RubyMine](https://docs.aws.amazon.com/toolkit-for-jetbrains/latest/userguide/welcome.html)
* [DataGrip](https://docs.aws.amazon.com/toolkit-for-jetbrains/latest/userguide/welcome.html)
* [VS Code](https://docs.aws.amazon.com/toolkit-for-vscode/latest/userguide/welcome.html)
* [Visual Studio](https://docs.aws.amazon.com/toolkit-for-visual-studio/latest/user-guide/welcome.html)

## Deploy the application

The `deploy` script creates the S3 bucket and deploys the application using the SAM CLI (included in this project's dev dependencies). The bucket name is hardcoded in the script. By default, the SAM CLI runs in `guided` mode, which prompts before each deployment so that you can change the default values.


You can find your API Gateway Endpoint URL in the output values displayed after deployment.
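
One detail to keep in mind if the deployment region is changed: S3's `CreateBucket` call requires an explicit `LocationConstraint` for regions other than `us-east-1`. The sketch below shows one way to build the arguments for boto3's `create_bucket`; the helper name `create_bucket_kwargs` is hypothetical and is not part of the deploy script.

```python
def create_bucket_kwargs(bucket: str, region_name: str = "us-east-1") -> dict:
    """Build keyword arguments for boto3's ``s3.create_bucket``."""
    kwargs = {"ACL": "private", "Bucket": bucket}
    # us-east-1 is the default location and must not be passed as a constraint.
    if region_name != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region_name}
    return kwargs


if __name__ == "__main__":
    print(create_bucket_kwargs("openadapt", "eu-west-1"))
```

With a boto3 S3 client, this would be used as `s3.create_bucket(**create_bucket_kwargs(bucket, region_name))`.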

## Use the SAM CLI to build and test locally

Build your application with the `sam build --use-container` command.

```bash
recording-uploader$ sam build --use-container
```

The SAM CLI installs dependencies defined in `uploader/requirements.txt`, creates a deployment package, and saves it in the `.aws-sam/build` folder.

Run functions locally and invoke them with the `sam local invoke` command.

```bash
recording-uploader$ sam local invoke RecordingUploadFunction
```
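
The handler in `uploader/app.py` reads `user_id` from the JSON-encoded request body, so a local invocation needs an event that carries one. The sketch below writes a minimal event file; the file name `event.json` and the `example-user` value are placeholders, and a real API Gateway proxy event contains many more fields than the single `body` key used here.

```python
import json
import pathlib


def make_upload_event(user_id: str) -> dict:
    """Return a minimal event with the JSON string body the handler parses."""
    # API Gateway delivers the request body to Lambda as a string, not a dict.
    return {"body": json.dumps({"user_id": user_id})}


def write_event_file(path: pathlib.Path, user_id: str) -> None:
    """Write the event to ``path`` so it can be passed to ``sam local invoke``."""
    path.write_text(json.dumps(make_upload_event(user_id), indent=2))


if __name__ == "__main__":
    write_event_file(pathlib.Path("event.json"), "example-user")
```

The resulting file can then be passed with `sam local invoke RecordingUploadFunction --event event.json`.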

The SAM CLI can also emulate your application's API. Use the `sam local start-api` command to run the API locally on port 3000.

```bash
recording-uploader$ sam local start-api
recording-uploader$ curl -X POST http://localhost:3000/upload -d '{"user_id": "<your-user-id>"}'
```

The SAM CLI reads the application template to determine the API's routes and the functions that they invoke. The `Events` property on each function's definition includes the route and method for each path.

```yaml
      Events:
        RecordingUpload:
          Type: Api
          Properties:
            Path: /upload
            Method: post
```
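
Because the `/upload` route hands the raw request body to the Lambda function, the handler has to cope with missing or malformed input. The following is a sketch of one way to validate the body before the user ID is used to build an S3 key; `parse_user_id` and the allowed-character pattern are assumptions for illustration and are not part of the uploader code.

```python
import json
import re
from typing import Optional

# Assumed policy: letters, digits, hyphens and underscores, 1 to 64 characters.
USER_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def parse_user_id(event: dict) -> Optional[str]:
    """Return the user ID from an API Gateway event, or None if it is invalid."""
    try:
        body = json.loads(event.get("body") or "")
    except (TypeError, ValueError):
        # Missing body, non-string body, or invalid JSON.
        return None
    if not isinstance(body, dict):
        return None
    user_id = body.get("user_id")
    if isinstance(user_id, str) and USER_ID_PATTERN.fullmatch(user_id):
        return user_id
    return None
```

A handler could return a 400 response whenever `parse_user_id` returns `None`, which also keeps characters such as `/` out of the generated object key.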

## Add a resource to your application
The application template uses AWS Serverless Application Model (AWS SAM) to define application resources. AWS SAM is an extension of AWS CloudFormation with a simpler syntax for configuring common serverless application resources such as functions, triggers, and APIs. For resources not included in [the SAM specification](https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md), you can use standard [AWS CloudFormation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-template-resource-type-ref.html) resource types.

## Fetch, tail, and filter Lambda function logs

To simplify troubleshooting, SAM CLI has a command called `sam logs`. `sam logs` lets you fetch logs generated by your deployed Lambda function from the command line. In addition to printing the logs on the terminal, this command has several nifty features to help you quickly find the bug.

`NOTE`: This command works for all AWS Lambda functions, not just the ones you deploy using SAM.

```bash
recording-uploader$ sam logs -n RecordingUploadFunction --stack-name "recording-uploader" --tail
```

You can find more information and examples about filtering Lambda function logs in the [SAM CLI Documentation](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-logging.html).

## Cleanup

To delete the sample application that you created, use the SAM CLI. Assuming you used your project name for the stack name, you can run the following:

```bash
sam delete --stack-name "recording-uploader"
```
1 change: 1 addition & 0 deletions admin/recording_uploader/__init__.py
@@ -0,0 +1 @@
"""Init file for the recording_uploader package."""
41 changes: 41 additions & 0 deletions admin/recording_uploader/deploy.py
@@ -0,0 +1,41 @@
"""Entrypoint to deploy the uploader to AWS Lambda."""

import pathlib
import subprocess

from loguru import logger
import boto3
import fire

CURRENT_DIR = pathlib.Path(__file__).parent


def main(region_name: str = "us-east-1", guided: bool = True) -> None:
    """Deploy the uploader to AWS Lambda.

    Args:
        region_name (str): The AWS region to deploy the Lambda function to.
        guided (bool): Whether to use the guided SAM deployment.
    """
    s3 = boto3.client(
        "s3",
        region_name=region_name,
        endpoint_url=f"https://s3.{region_name}.amazonaws.com",
    )
    bucket = "openadapt"

    s3.create_bucket(
        ACL="private",
        Bucket=bucket,
    )

    # deploy the code to AWS Lambda
    commands = ["sam", "deploy"]
    if guided:
        commands.append("--guided")
    subprocess.run(commands, cwd=CURRENT_DIR, check=True)
    logger.info("Lambda function deployed successfully.")


if __name__ == "__main__":
    fire.Fire(main)
34 changes: 34 additions & 0 deletions admin/recording_uploader/samconfig.toml
@@ -0,0 +1,34 @@
# More information about the configuration file can be found here:
# https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-config.html
version = 0.1

[default]
[default.global.parameters]
stack_name = "recording-uploader"

[default.build.parameters]
cached = true
parallel = true

[default.validate.parameters]
lint = true

[default.deploy.parameters]
capabilities = "CAPABILITY_IAM"
confirm_changeset = false
resolve_s3 = true
s3_prefix = "recording-uploader"
region = "us-east-1"
image_repositories = []

[default.package.parameters]
resolve_s3 = true

[default.sync.parameters]
watch = true

[default.local_start_api.parameters]
warm_containers = "EAGER"

[default.local_start_lambda.parameters]
warm_containers = "EAGER"
48 changes: 48 additions & 0 deletions admin/recording_uploader/template.yaml
@@ -0,0 +1,48 @@
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  recording-uploader

  Sample SAM Template for recording-uploader

# More info about Globals: https://github.com/awslabs/serverless-application-model/blob/master/docs/globals.rst
Globals:
  Function:
    Timeout: 3

Resources:
  RecordingUploadFunction:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
    Properties:
      CodeUri: uploader/
      Handler: app.lambda_handler
      Runtime: python3.10
      Architectures:
        - x86_64
      Events:
        RecordingUpload:
          Type: Api # More info about API Event Source: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#api
          Properties:
            Path: /upload
            Method: post
      Policies:
        - Statement:
            - Sid: S3PutObjectPolicy
              Effect: Allow
              Action:
                - s3:PutObject
              Resource: !Sub "arn:aws:s3:::openadapt/*"

Outputs:
  # ServerlessRestApi is an implicit API created out of Events key under Serverless::Function
  # Find out more about other implicit resources you can reference within SAM
  # https://github.com/awslabs/serverless-application-model/blob/master/docs/internals/generated_resources.rst#api
  RecordingUploadApi:
    Description: "API Gateway endpoint URL for Prod stage for Recording Upload function"
    Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/upload/"
  RecordingUploadFunction:
    Description: "Recording Upload Lambda Function ARN"
    Value: !GetAtt RecordingUploadFunction.Arn
  RecordingUploadFunctionIamRole:
    Description: "Implicit IAM Role created for Recording Upload function"
    Value: !GetAtt RecordingUploadFunctionRole.Arn
1 change: 1 addition & 0 deletions admin/recording_uploader/uploader/__init__.py
@@ -0,0 +1 @@
"""Init file for the uploader module."""
60 changes: 60 additions & 0 deletions admin/recording_uploader/uploader/app.py
@@ -0,0 +1,60 @@
"""Lambda function for generating a presigned URL for uploading a recording to S3."""

from typing import Any
from uuid import uuid4
import json

from botocore.client import Config
import boto3

DEFAULT_REGION_NAME = "us-east-1"
DEFAULT_BUCKET = "openadapt"
ONE_HOUR_IN_SECONDS = 3600


def lambda_handler(event: dict, context: Any) -> dict:
    """Main entry point for the lambda function."""
    try:
        user_id = json.loads(event["body"])["user_id"]
    except Exception as e:
        print(e)
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "Missing 'user_id' in request body."}),
        }
    return {
        "statusCode": 200,
        "body": json.dumps(get_presigned_url(user_id)),
    }


def get_presigned_url(
    user_id: str, bucket: str = DEFAULT_BUCKET, region_name: str = DEFAULT_REGION_NAME
) -> dict:
    """Generate a presigned URL for uploading a recording to S3.

    Args:
        bucket (str): The S3 bucket to upload the recording to.
        region_name (str): The AWS region the bucket is in.

    Returns:
        dict: A dictionary containing the presigned URL.
    """
    s3 = boto3.client(
        "s3",
        config=Config(signature_version="s3v4"),
        region_name=region_name,
        endpoint_url=f"https://s3.{region_name}.amazonaws.com",
    )
    key = f"recordings/{user_id}/{uuid4()}.zip"

    presigned_url = s3.generate_presigned_url(
        ClientMethod="put_object",
        Params={
            "Bucket": bucket,
            "Key": key,
        },
        ExpiresIn=ONE_HOUR_IN_SECONDS,
    )

    return {"url": presigned_url}
1 change: 1 addition & 0 deletions admin/recording_uploader/uploader/requirements.txt
@@ -0,0 +1 @@
boto3==1.34.84
10 changes: 10 additions & 0 deletions openadapt/app/dashboard/api/recordings.py
@@ -5,11 +5,13 @@
from fastapi import APIRouter, WebSocket

from openadapt.custom_logger import logger
from openadapt.config import config
from openadapt.db import crud
from openadapt.deprecated.app import cards
from openadapt.events import get_events
from openadapt.models import Recording
from openadapt.plotting import display_event
from openadapt.share import upload_recording_to_s3
from openadapt.utils import image2utf8, row2dict


@@ -29,6 +31,9 @@ def attach_routes(self) -> APIRouter:
        self.app.add_api_route("/start", self.start_recording)
        self.app.add_api_route("/stop", self.stop_recording)
        self.app.add_api_route("/status", self.recording_status)
        self.app.add_api_route(
            "/{recording_id}/upload", self.upload_recording, methods=["POST"]
        )
        self.recording_detail_route()
        return self.app

@@ -63,6 +68,11 @@ def recording_status() -> dict[str, bool]:
        """Get the recording status."""
        return {"recording": cards.is_recording()}

    def upload_recording(self, recording_id: int) -> dict[str, str]:
        """Upload a recording."""
        upload_recording_to_s3(config.UNIQUE_USER_ID, recording_id)
        return {"message": "Recording uploaded"}

    def recording_detail_route(self) -> None:
        """Add the recording detail route as a websocket."""

8 changes: 1 addition & 7 deletions openadapt/app/dashboard/app/providers.tsx
@@ -3,6 +3,7 @@ import { get } from '@/api'
import posthog from 'posthog-js'
import { PostHogProvider } from 'posthog-js/react'
import { useEffect } from 'react'
import { getSettings } from './utils'

if (typeof window !== 'undefined') {
    if (process.env.NEXT_PUBLIC_MODE !== "development") {
@@ -12,13 +13,6 @@ if (typeof window !== 'undefined') {
    }
}

async function getSettings(): Promise<Record<string, string>> {
    return get('/api/settings?category=general', {
        cache: 'no-store',
    })
}


export function CSPostHogProvider({ children }: { children: React.ReactNode }) {
    useEffect(() => {
        if (process.env.NEXT_PUBLIC_MODE !== "development") {