
Hash model inputs instead of parameters #324

Merged · 40 commits · Jun 22, 2023

Conversation

@danielholanda (Contributor) commented Jun 14, 2023

Closes #322

Overview

This PR stops hashing models by their weights and instead hashes their input shapes.

Description

This PR closes #322 by implementing the long-term solution and reverting the workaround mentioned in #316 (comment), closing #316 in the same PR.

It also adds a test to test/analysis.py to prevent this kind of breakage in the future.
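
For context, here is a minimal sketch of the hashing idea. It is illustrative only, not the actual analysis.py implementation; the helper name workload_hash and its arguments are assumptions. The point is to identify a workload by the model's structure plus the shapes of the inputs it was called with, rather than by its parameter values.

import hashlib

import torch


def workload_hash(model: torch.nn.Module, args: tuple, kwargs: dict) -> str:
    """Illustrative sketch: hash a workload by model structure and input
    shapes rather than by parameter (weight) values."""
    # Describe the model without reading its weight values
    structure = repr(model)

    # Record the shape of every tensor input, positional and keyword
    shapes = []
    for position, arg in enumerate(args):
        if isinstance(arg, torch.Tensor):
            shapes.append((f"Positional Arg {position + 1}", tuple(arg.shape)))
    for name, value in kwargs.items():
        if isinstance(value, torch.Tensor):
            shapes.append((name, tuple(value.shape)))

    # Hash structure plus shapes; truncate to 8 hex chars like the output below
    return hashlib.sha256((structure + str(shapes)).encode()).hexdigest()[:8]

Under this scheme, the two calls in two_calls.py below yield two distinct hashes even though the weights are identical, which matches the expected results shown in the next section.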

Manual testing

CI tests were added. However, you may manually test this PR by running benchit two_calls.py --analyze-only.

two_calls.py

# labels: name::mobilenetv2_035 author::timm task::Computer_Vision
import torch
import timm
from mlagility.parser import parse

# Parsing command-line arguments
batch_size = parse(["batch_size"])

# Creating the model and setting it to evaluation mode
model = timm.create_model("mobilenetv2_035", pretrained=False)
model.eval()

# Creating inputs
inputs1 = torch.rand((1, 3, 28, 28))
inputs2 = torch.rand((1, 3, 224, 224))

# Calling model
model(inputs1)
model(inputs2)

Expected results:

Models discovered during profiling:

two_calls.py:
        model (executed 1x - 0.01s)
                Model Type:     Pytorch (torch.nn.Module)
                Class:          EfficientNet (<class 'timm.models.efficientnet.EfficientNet'>)
                Location:       /home/dhnoronha/.local/lib/python3.8/site-packages/timm/models/efficientnet_builder.py, line 472
                Parameters:     1,677,128 (3.2 MB)
                Input Shape:    {'Positional Arg 1': (1, 3, 28, 28)}
                Hash:           b63f3042

        model (executed 1x - 0.03s)
                Model Type:     Pytorch (torch.nn.Module)
                Class:          EfficientNet (<class 'timm.models.efficientnet.EfficientNet'>)
                Location:       /home/dhnoronha/.local/lib/python3.8/site-packages/timm/models/efficientnet_builder.py, line 472
                Parameters:     1,677,128 (3.2 MB)
                Input Shape:    {'Positional Arg 1': (1, 3, 224, 224)}
                Hash:           062ae984


Woohoo! The 'benchmark' command is complete.

Note: Input Shape is only displayed when a model has more than one workload
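
A hedged sketch of how that display rule could work (the helper below is illustrative and not taken from status.py; the workload dictionary fields are assumptions): group the discovered workloads by the model they came from, and emit the Input Shape line only for models that produced more than one workload.

from collections import defaultdict


def models_needing_input_shape(workloads):
    """Illustrative only: decide, per discovered model, whether the status
    printout should include an Input Shape line."""
    # workloads is assumed to be a list of dicts such as
    # {"model_name": "model", "input_shapes": {...}, "hash": "b63f3042"}
    by_model = defaultdict(list)
    for workload in workloads:
        by_model[workload["model_name"]].append(workload)
    # Show Input Shape only for models that produced more than one workload
    return {name: len(items) > 1 for name, items in by_model.items()}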

@danielholanda danielholanda self-assigned this Jun 14, 2023
@danielholanda danielholanda marked this pull request as ready for review June 16, 2023 00:48
@danielholanda (Contributor, Author) commented:

Should we start differentiating between workloads and models in our documentation?

Examples:

  • Replace "Each model in a script is identified by a unique hash" with "Each workload in a script is identified by a unique hash" (user guide)
  • Replace "Models discovered during profiling:" with "Workloads discovered during profiling:" (CLI message)

@jeremyfowers (Contributor) left a comment:


Discuss architecture

Review threads:
src/mlagility/analysis/analysis.py (resolved)
src/mlagility/analysis/analysis.py (outdated, resolved)
src/mlagility/analysis/analysis.py (outdated, resolved)
src/mlagility/analysis/status.py (outdated, resolved)
src/mlagility/analysis/status.py (outdated, resolved)
@danielholanda (Contributor, Author) commented Jun 17, 2023

Heads up: significant changes were made to ensure that analysis also works when the same model is called with multiple inputs AND max-depth is > 1. Here is an example of things working correctly.

Note that the shapes and hashes of the "deeper" models are different when you compare the two workloads.

two_input_shapes.py:
        model (executed 1x - 0.18s)
                Model Type:     Pytorch (torch.nn.Module)
                Class:          EfficientNet (<class 'timm.models.efficientnet.EfficientNet'>)
                Location:       /home/dhnoronha/.local/lib/python3.8/site-packages/timm/models/efficientnet_builder.py, line 472
                Parameters:     1,677,128 (3.2 MB)
                Input Shape:    'Positional Arg 1': (1, 3, 28, 28)
                Hash:           4c4868c4

                        blocks (executed 1x - 0.17s)
                                Model Type:     Pytorch (torch.nn.Module)
                                Class:          Sequential (<class 'torch.nn.modules.container.Sequential'>)
                                Parameters:     249,744 (0.5 MB)
                                Input Shape:    'Positional Arg 1': (1, 16, 14, 14)
                                Hash:           2b4daad8

                        bn2 (executed 1x - 0.00s)
                                Model Type:     Pytorch (torch.nn.Module)
                                Class:          BatchNormAct2d (<class 'timm.models.layers.norm_act.BatchNormAct2d'>)
                                Parameters:     2,560 (<0.1 MB)
                                Input Shape:    'Positional Arg 1': (1, 1280, 1, 1)
                                Hash:           26e367bf

        model (executed 1x - 0.22s)
                Model Type:     Pytorch (torch.nn.Module)
                Class:          EfficientNet (<class 'timm.models.efficientnet.EfficientNet'>)
                Location:       /home/dhnoronha/.local/lib/python3.8/site-packages/timm/models/efficientnet_builder.py, line 472
                Parameters:     1,677,128 (3.2 MB)
                Input Shape:    'Positional Arg 1': (1, 3, 224, 224)
                Hash:           cbdf10c5

                        blocks (executed 1x - 0.21s)
                                Model Type:     Pytorch (torch.nn.Module)
                                Class:          Sequential (<class 'torch.nn.modules.container.Sequential'>)
                                Parameters:     249,744 (0.5 MB)
                                Input Shape:    'Positional Arg 1': (1, 16, 112, 112)
                                Hash:           7714ae20

                        bn2 (executed 1x - 0.00s)
                                Model Type:     Pytorch (torch.nn.Module)
                                Class:          BatchNormAct2d (<class 'timm.models.layers.norm_act.BatchNormAct2d'>)
                                Parameters:     2,560 (<0.1 MB)
                                Input Shape:    'Positional Arg 1': (1, 1280, 7, 7)
                                Hash:           83405a1c
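
To make that claim concrete, here is a hypothetical check that reuses the illustrative workload_hash() sketch from the description above; the submodule access and shapes are assumptions based on the output shown here, not code from the PR. The same nested submodule, fed the intermediate shapes produced by the two top-level calls, should yield two different workload hashes.

import timm
import torch

# Mirror the two_calls.py example: same model, same nested submodule
model = timm.create_model("mobilenetv2_035", pretrained=False).eval()
blocks = model.blocks  # the nested Sequential shown in the output above

# Intermediate shapes taken from the two workloads printed above
hash_small = workload_hash(blocks, (torch.rand(1, 16, 14, 14),), {})
hash_large = workload_hash(blocks, (torch.rand(1, 16, 112, 112),), {})
assert hash_small != hash_large  # different input shapes, different workloads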

@danielholanda danielholanda marked this pull request as draft June 17, 2023 00:04
@danielholanda (Contributor, Author) commented:

@jeremyfowers I converted this PR back to draft since I intend to change the messages displayed to the user, as discussed in #324 (comment).

@danielholanda danielholanda marked this pull request as ready for review June 20, 2023 23:45
@jeremyfowers (Contributor) left a comment:


Looking great, this is pretty much good to go! Just doing "request changes" so that I have a chance to check the documentation addition before this merges.

Review threads:
src/mlagility/analysis/status.py (outdated, resolved)
models/llm_layer/llama_layer_prototype.py (outdated, resolved)
src/mlagility/analysis/analysis.py (outdated, resolved)
src/mlagility/analysis/status.py (outdated, resolved)
@jeremyfowers (Contributor) left a comment:


Thank you for all the brainstorming and multiple rounds of feedback! This change wound up in a really good place.

I pushed a few commits with minor copy-editing and I added your new example to CI (no functionality changes).


Successfully merging this pull request may close these issues: Analysis fails: Model hash not found

2 participants