Load models from Object Store #2

Open · wants to merge 2 commits into main
172 changes: 118 additions & 54 deletions Cargo.lock


8 changes: 4 additions & 4 deletions Cargo.toml
@@ -5,8 +5,8 @@ authors = []
 edition = "2018"
 
 [dependencies]
-fastly = "0.8.7"
+fastly = "0.9.1"
 # don't fancy tensorflow? Try: tract-kaldi, tract-onnx, tract-nnef
-tract_flavour = { package = "tract-tensorflow", version = "0.17.7" }
-image = { version = "0.24.3", default-features = false, features = ["jpeg"] }
-rand = "0.8.3"
+tract_flavour = { package = "tract-tensorflow", version = "0.19.7" }
+image = { version = "0.24.5", default-features = false, features = ["jpeg"] }
+rand = "0.8.5"
6 changes: 3 additions & 3 deletions README.md
@@ -2,21 +2,21 @@
 
 It's [here](https://developer.fastly.com/solutions/demos/edgeml/)!
 
-This targets `wasm32-wasi` for [Fastly's Compute@Edge](https://www.fastly.com/products/edge-compute/serverless). It uses an extrinsic tool, [`wasm-opt`](https://github.com/WebAssembly/binaryen#tools), to squeeze - among other things – an ML [inference engine](https://en.wikipedia.org/wiki/Inference_engine) into a 35MB-ish [wasm](https://webassembly.org/) binary.
+This targets `wasm32-wasi` for [Fastly's Compute@Edge](https://www.fastly.com/products/edge-compute/serverless). It uses an external tool, [`wasm-opt`](https://github.com/WebAssembly/binaryen#tools), to squeeze - among other things – an ML [inference engine](https://en.wikipedia.org/wiki/Inference_engine) into a 35MB-ish [wasm](https://webassembly.org/) binary.
 
 This demo showcases image classification using a [top-tier MobileNetV2 checkpoint](https://github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet). Owing to the flexibility of [`tract`](https://github.com/sonos/tract) under the hood, the [TensorFlow Lite](https://www.tensorflow.org/lite/guide/hosted_model) model deployed can be swapped for another, including open interchange formats ([ONNX](https://onnx.ai/) / [NNEF](https://www.khronos.org/nnef)).
 
 This demo was created to push the boundaries of the platform and inspire new ideas.
 
 ## Publishing end-to-end
 
-Using the Fastly CLI, publish the root package and note the `[funky-domain].edgecompute.app`:
+Using the Fastly CLI, publish the root package and note the `[your-random-subdomain].edgecompute.app`:
 
 ```sh
 fastly compute publish
 ```
 
-Update L54 in [`docs/script.js`](./docs/script.js) to `[funky-domain].edgecompute.app` you just noted, and publish the static demo site separately:
+Update L1 in [`docs/script.js`](./docs/script.js) to `const ML_BACKEND = "[your-random-subdomain].edgecompute.app"`, and publish the static demo site separately:
 
 ```sh
 cd static-host
5 changes: 3 additions & 2 deletions docs/script.js
8 changes: 6 additions & 2 deletions fastly.toml
@@ -2,7 +2,11 @@
 # https://developer.fastly.com/reference/fastly-toml/
 
 authors = ["<dmilitaru@fastly.com>"]
-description = "Machine learning inference at the edge"
+description = "Machine learning inference at the edge and Object Store"
 language = "rust"
 manifest_version = 2
-name = "edgeml"
+name = "edgeml-objstore"
+service_id = "oofCZrn4JHkjdtm9lySZK1"
+
+# [local_server]
+# object_store.models = []
Binary file removed models/compiled.nnef
Binary file removed models/mobilenet_v1_0.5_224_frozen.pb
Binary file removed models/mobilenet_v2_1.0_224_frozen.pb
6 changes: 3 additions & 3 deletions pkg.sh
@@ -1,8 +1,8 @@
 #!/bin/bash
-fastly compute build --force
+fastly compute build
 
 BUNDLE_DIR=pkg
-PROJECT=hackadora
+PROJECT=edgeml-objstore
 PKDIR=$BUNDLE_DIR/$PROJECT
 
 # Create a bundle directory
@@ -19,7 +19,7 @@ cp Cargo.toml $PKDIR
 
 # Optimise the wasm some more
 # https://github.com/WebAssembly/binaryen#tools
-wasm-opt target/wasm32-wasi/release/$PROJECT.wasm -O -o $PKDIR/bin/main.wasm
+wasm-opt bin/main.wasm -O -o $PKDIR/bin/main.wasm
 
 # Archive the directory
 (cd $BUNDLE_DIR && tar -czf $PROJECT.tar.gz $PROJECT)
10 changes: 0 additions & 10 deletions src/log.rs

This file was deleted.

51 changes: 27 additions & 24 deletions src/main.rs
@@ -1,39 +1,42 @@
-mod log;
 mod ml;
 
 use fastly::http::{Method, StatusCode};
-use fastly::{Error, Request, Response};
-use log::emit_log;
+use fastly::{Error, Request, Response, ObjectStore};
+
+const ML_MODEL: &str = "mobilenet_v2_1.4_224";
 
 #[fastly::main]
 fn main(mut req: Request) -> Result<Response, Error> {
+    // `models` Object Store.
+    let models = ObjectStore::open("models")?.unwrap();
+
     let mut resp = Response::new()
         .with_header("Access-Control-Allow-Origin", "*")
         .with_header("Access-Control-Allow-Headers", "Content-Type");
-    let session = req.get_query_str().unwrap_or("session=")[8..].to_owned();
-    let context = "main";
 
     match (req.get_method(), req.get_header_str("Content-Type")) {
        (&Method::POST, Some("image/jpeg")) => {
-            emit_log(
-                context,
-                &session,
-                "Loading model mobilenet_v2_1.4_224 (ImageNet).",
-            );
-            let model = include_bytes!("../models/mobilenet_v2_1.4_224_frozen.pb");
-            match ml::infer(model, &req.take_body_bytes(), &session) {
-                Ok((confidence, label_index)) => {
-                    emit_log(
-                        context,
-                        &session,
-                        &format!("Image classified! ImageNet label index {} (confidence {:2}).", label_index, confidence)
-                    );
-                    resp.set_body_text_plain(&format!("{},{}", confidence, label_index));
-                }
-                Err(e) => {
-                    emit_log(context, &session, &format!("Inference error: {:?}", e));
-                    resp.set_body_text_plain(&format!("errored: {:?}", e));
+            // To use a model, load it from the Object Store.
+            match models.lookup_bytes(ML_MODEL) {
+                Ok(Some(model)) => {
+                    println!("Loaded model {} from Object Store.", ML_MODEL);
+                    match ml::infer(&model, &req.take_body_bytes()) {
+                        Ok((confidence, label_index)) => {
+                            println!("Image classified! ImageNet label index {} (confidence {:2}).", label_index, confidence);
+                            resp.set_body_text_plain(&format!("{},{}", confidence, label_index));
+                        }
+                        Err(e) => {
+                            eprintln!("Inference error: {:?}", e);
+                            resp.set_body_text_plain(&format!("errored: {:?}", e));
+                        }
+                    }
+                },
+                _ => {
+                    resp.set_status(StatusCode::INTERNAL_SERVER_ERROR);
+                    resp.set_body_text_plain(&format!("Failed to load model {} from Object Store.", ML_MODEL));
                 }
             }
         }
         (&Method::OPTIONS, _) => resp.set_status(StatusCode::OK),
         _ => resp.set_status(StatusCode::IM_A_TEAPOT),
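
One sharp edge in the new `main`: `ObjectStore::open("models")?.unwrap()` panics when no store named `models` is linked to the service, since `open` returns `Ok(None)` in that case. A minimal hardening sketch (hypothetical, not part of this PR; the helper name is invented) that surfaces the condition as a plain 500 instead:

```rust
use fastly::http::StatusCode;
use fastly::{ObjectStore, Response};

// Hypothetical helper, not in this PR: map a missing or failing Object Store
// to an HTTP 500 response instead of panicking inside the request handler.
fn open_models_or_500() -> Result<ObjectStore, Response> {
    match ObjectStore::open("models") {
        Ok(Some(store)) => Ok(store),
        Ok(None) => Err(Response::from_status(StatusCode::INTERNAL_SERVER_ERROR)
            .with_body_text_plain("Object Store `models` is not linked to this service.\n")),
        Err(e) => Err(Response::from_status(StatusCode::INTERNAL_SERVER_ERROR)
            .with_body_text_plain(&format!("Object Store error: {:?}\n", e))),
    }
}
```

In `main`, this would slot in as `let models = match open_models_or_500() { Ok(s) => s, Err(resp) => return Ok(resp) };`.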
34 changes: 8 additions & 26 deletions src/ml.rs
@@ -1,56 +1,38 @@
 use std::io::Cursor;
 use tract_flavour::prelude::*;
-use crate::log::emit_log;
 
 // The inference function returns a tuple:
 // (confidence, index of the predicted class)
-pub fn infer(model_bytes: &[u8], image_bytes: &[u8], session: &str) -> TractResult<(f32,i32)> {
-    let context = "inference_engine";
-    emit_log(
-        context,
-        session,
-        "Optimizing runnable Tensorflow model for F32 datum type, tensor shape [1, 224, 224, 3].",
-    );
+pub fn infer(model_bytes: &[u8], image_bytes: &[u8]) -> TractResult<(f32, i32)> {
+    println!("Optimizing Tensorflow model for F32 datum type, tensor shape [1, 224, 224, 3].");
     let model = tract_flavour::tensorflow() // swap in ::nnef() for the tract-nnef package, etc.
         // Load the model.
         .model_for_read(&mut Cursor::new(model_bytes))?
         // Specify input type and shape.
-        .with_input_fact(0,InferenceFact::dt_shape(f32::datum_type(), tvec!(1, 224, 224, 3)))?
+        .with_input_fact(0, f32::fact(&[1, 224, 224, 3]).into())?
         // Optimize the model.
         .into_optimized()?
         // Make the model runnable and fix its inputs and outputs.
         .into_runnable()?;
 
     // Create a new image from the image byte slice.
     let img = image::load_from_memory(image_bytes)?.to_rgb8();
-    emit_log(
-        context,
-        session,
-        "Resizing image to fit 224x224 (filter algorithm: nearest neighbour).",
-    );
+    println!("Resizing image to fit 224x224 (filter algorithm: nearest neighbour).");
     // Resize the input image to the dimension the model was trained on.
     // Sampling filter and performance comparison: https://docs.rs/image/0.23.12/image/imageops/enum.FilterType.html#examples
     // Switch to FilterType::Triangle if you're getting odd results.
     let resized = image::imageops::resize(&img, 224, 224, image::imageops::FilterType::Nearest);
 
-    emit_log(
-        context,
-        session,
-        "Converting scaled image to tensor and running model...",
-    );
+    println!("Converting scaled image to tensor and running model...");
     // Make a Tensor out of it.
     let img: Tensor = tract_ndarray::Array4::from_shape_fn((1, 224, 224, 3), |(_, y, x, c)| {
         resized[(x as _, y as _)][c] as f32 / 255.0
     })
     .into();
 
     // Run the model on the input.
-    let result = model.run(tvec!(img))?;
-    emit_log(
-        context,
-        session,
-        &format!("Inference complete. Traversing results graph to find a best-confidence fit...")
-    );
+    let result = model.run(tvec!(img.into()))?;
+    println!("Inference complete. Traversing results graph to find a best-confidence fit...");
 
     // Find the max value with its index.
     let best = result[0]
@@ -59,6 +41,6 @@ pub fn infer(model_bytes: &[u8], image_bytes: &[u8], session: &str) -> TractResult<(f32,i32)> {
         .cloned()
         .zip(1..)
         .max_by(|a, b| a.0.partial_cmp(&b.0).unwrap());
 
     Ok(best.unwrap())
 }
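
Because `tract_flavour` is just a package alias, switching model formats stays a Cargo-level change, as the `// swap in ::nnef()` comment hints. As a hedged sketch (not part of this PR), assuming `tract_flavour = { package = "tract-onnx", version = "0.19.7" }` in `Cargo.toml` and an ONNX MobileNet checkpoint, the loading step might look like:

```rust
use std::io::Cursor;
use tract_flavour::prelude::*;

// Sketch only: ONNX MobileNet checkpoints commonly expect NCHW input,
// [1, 3, 224, 224], unlike the NHWC layout of the TensorFlow model above.
pub fn load_onnx_model(model_bytes: &[u8]) -> TractResult<TypedSimplePlan<TypedModel>> {
    tract_flavour::onnx()
        // Load the model from bytes, exactly as in infer() above.
        .model_for_read(&mut Cursor::new(model_bytes))?
        // Specify input type and shape (note the channel-first layout).
        .with_input_fact(0, f32::fact(&[1, 3, 224, 224]).into())?
        .into_optimized()?
        .into_runnable()
}
```

The tensor construction in `infer` would flip accordingly, indexing as `(1, 3, 224, 224)` with the channel first.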
43 changes: 22 additions & 21 deletions static-host/Cargo.lock
8 changes: 4 additions & 4 deletions static-host/Cargo.toml
@@ -1,5 +1,5 @@
 [package]
-name = "compute-starter-kit-rust-default"
+name = "edgeml-demo"
 version = "0.1.0"
 authors = []
 edition = "2018"
@@ -11,6 +11,6 @@ publish = false
 debug = 1
 
 [dependencies]
-fastly = "0.8.7"
-include_dir = "0.7.2"
-mime_guess = "2.0.3"
+fastly = "0.9.1"
+include_dir = "0.7.3"
+mime_guess = "2.0.4"
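
For context (the `static-host` source itself isn't shown in this diff), a static host built on these three crates can be quite small. The sketch below is an assumption, not the PR's actual code: the embedded `../docs` path and the routing are invented for illustration.

```rust
use fastly::http::StatusCode;
use fastly::{Error, Request, Response};
use include_dir::{include_dir, Dir};

// Assumed asset location; the real static-host embeds its own tree.
static ASSETS: Dir<'static> = include_dir!("$CARGO_MANIFEST_DIR/../docs");

#[fastly::main]
fn main(req: Request) -> Result<Response, Error> {
    // Map "/" to the index page, otherwise strip the leading slash.
    let path = req.get_path().trim_start_matches('/');
    let path = if path.is_empty() { "index.html" } else { path };

    match ASSETS.get_file(path) {
        Some(file) => {
            // Guess the Content-Type from the file extension.
            let mime = mime_guess::from_path(path).first_or_octet_stream();
            Ok(Response::from_body(file.contents()).with_content_type(mime))
        }
        None => Ok(Response::from_status(StatusCode::NOT_FOUND)
            .with_body_text_plain("Not found.\n")),
    }
}
```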