
continued refactoring of feature extractor and classifier #38

Merged
keighrim merged 14 commits into develop from refactor-feat-extractor on Dec 15, 2023

Conversation

keighrim
Member

@keighrim keighrim commented Dec 12, 2023

More fixes for #31.


I cherry-picked some old commits of mine and tried to resolve all the conflicts with the current code. @marcverhagen could you verify the code runs? A few notes:

  • I changed many config key names. Please find the full list in the modeling/config/classifier-full.yml file.
  • There is also modeling/config/classifier-no-position.yml (symlinked to example-config.yml), which has configs for the "old" model without pos_enc.
  • I merged the two config files (model config and classifier config) into a single classifier config YAML file. I'm not sure how that will impact the 'export' code for the model configs in the train.py module, though.

Let me know if you have questions.
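For reference, a merged config might look roughly like the sketch below. The key names are taken from this discussion (model_file, frameRate, labels, img_enc_name, pos_enc_name, num_layers, dropouts), but the grouping and values are illustrative, not the actual contents of classifier-full.yml:

```yaml
# hypothetical sketch of a merged classifier + model config
model_file: "modeling/models/20231026-164841.kfold_000.pt"

# classifier-side settings (how the classifier operates)
frameRate: 1000
labels: ["slate", "chyron", "credit"]

# model-side settings (inherent to the chosen checkpoint;
# changing these without retraining breaks state_dict loading)
img_enc_name: "convnext_tiny"
pos_enc_name: "sinusoidal-concat"
num_layers: 3
dropouts: 0.1
```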

@keighrim keighrim changed the title from "continued refactoring of" to "continued refactoring of feature extractor and classifier" Dec 12, 2023
@marcverhagen
Contributor

There are a few issues with running the classifier; most of them so far seem minor:

  • Both app.py and classify.py have a default config file that does not exist. I ran classify.py using the configuration in modeling/config/classifier-no-positional.yml.
  • There is a warning that softmax expects a dim argument; I do not remember having seen this before.
  • The code relies on "other" being included in self.labels, but it isn't, so an error is thrown when the other category has the highest score. Maybe revisit the choice to not include "other".
  • The prediction is now just the value for the highest-scoring label, but downstream processing needs all of them. This actually makes the previous problem go away, but I still want to revisit the labels.
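On the softmax warning: recent PyTorch versions warn when softmax is applied without an explicit dim, because the implicitly chosen dimension is deprecated. The dim just names the axis the scores are normalized over; what softmax does to one row of label scores can be sketched with the stdlib (the numbers here are made up):

```python
import math

def softmax(scores):
    # numerically stable softmax over one row of label scores:
    # subtract the max before exponentiating, then normalize
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# e.g. raw scores for ["slate", "chyron", "credit"]
probs = softmax([2.0, 1.0, 0.1])
```

In torch code the usual fix is to pass the axis explicitly, e.g. torch.nn.Softmax(dim=1) when scores are batched per frame.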

With some poking around in the code I could get the classifier to at least spit out the frame predictions; making it work with the downstream knitting code needs a few more small edits.

I have not yet tried to create a new model and use that.

About combining model config and classifier config...

I assume classifier config means those settings that affect how the classifier operates, like frameRate, and that model config refers to settings like num_layers and dropout. If we combine them, it has to be made clear that the user cannot simply change the latter, since those are inherent to the chosen model. Also, I thought the model settings were saved with the model when it was created (alongside the results file), which is why I had model_file and model_config:

model_file: "modeling/models/20231026-164841.kfold_000.pt"
model_config: "modeling/models/20231026-164841.config.yml"

I wasn't necessarily happy with that and was thinking about just having

model: "modeling/models/20231026-164841.kfold_000.pt"

and have the code figure out where to find the configuration of that model.
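That convention could be a small helper that derives the config path from the checkpoint name; a sketch, assuming checkpoints keep the <timestamp>.<fold>.pt naming shown above with the training config saved alongside as <timestamp>.config.yml:

```python
from pathlib import Path

def config_path_for(model_path: str) -> Path:
    # assumes checkpoints are named <timestamp>.<fold>.pt and their
    # training config sits alongside as <timestamp>.config.yml
    p = Path(model_path)
    timestamp = p.name.split(".")[0]
    return p.with_name(f"{timestamp}.config.yml")

# config_path_for("modeling/models/20231026-164841.kfold_000.pt")
# resolves to modeling/models/20231026-164841.config.yml
```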

Merging the two does not impact the trainer's export code, but we do now manually take some of the settings from the config export and add them to another config file. It looks like we would now need to manually update the configurations whenever we pick a different model.

@marcverhagen
Contributor

@keighrim Are the configuration settings in modeling/config/trainer.yml the ones that I should use for generating the new model to include as the default model in the app?

@marcverhagen
Contributor

I ran the classifier again using the positional model that I created yesterday, same error:

python classify.py --config modeling/config/classifier-test.yaml --input modeling/data/cpb-aacip-690722078b2-0000-0100.mp4 
Traceback (most recent call last):
  File "/Users/marc/Documents/git/clams/app-swt-detection/classify.py", line 289, in <module>
    classifier = Classifier(**yaml.safe_load(open(args.config)))
  File "/Users/marc/Documents/git/clams/app-swt-detection/classify.py", line 42, in __init__
    self.classifier.load_state_dict(torch.load(config["model_file"]))
  File "/Applications/ADDED/venv/clams/app-swt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Sequential:
	size mismatch for fc1.weight: copying a param with shape torch.Size([128, 1280]) from checkpoint, the shape in current model is torch.Size([128, 60768]).

The config file used here was the same as classifier-full.yml except for:

  • it uses a different model
  • it uses convnext_tiny instead of convnext_lg
  • it uses [ "slate", "chyron", "credit"] as the labels

It did still have a reference to "other" in the labels list, so I got rid of that. This gave more errors:

python classify.py --config modeling/config/classifier-test.yaml --input modeling/data/cpb-aacip-690722078b2-0000-0100.mp4 
Traceback (most recent call last):
  File "/Users/marc/Documents/git/clams/app-swt-detection/classify.py", line 289, in <module>
    classifier = Classifier(**yaml.safe_load(open(args.config)))
  File "/Users/marc/Documents/git/clams/app-swt-detection/classify.py", line 42, in __init__
    self.classifier.load_state_dict(torch.load(config["model_file"]))
  File "/Applications/ADDED/venv/clams/app-swt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Sequential:
	size mismatch for fc1.weight: copying a param with shape torch.Size([128, 1280]) from checkpoint, the shape in current model is torch.Size([128, 60768]).
	size mismatch for fc_out.weight: copying a param with shape torch.Size([4, 64]) from checkpoint, the shape in current model is torch.Size([3, 64]).
	size mismatch for fc_out.bias: copying a param with shape torch.Size([4]) from checkpoint, the shape in current model is torch.Size([3]).
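These size mismatches mean the network built from the config has different layer shapes than the ones stored in the checkpoint: here a 1280-dim vs 60768-dim input to fc1, and 4 vs 3 output labels after dropping "other". The comparison load_state_dict performs can be sketched in plain Python (shapes copied from the traceback above):

```python
def shape_mismatches(model_shapes, ckpt_shapes):
    # report parameters whose checkpoint shape differs from the
    # shape expected by the freshly constructed model
    return {name: (ckpt_shapes[name], expected)
            for name, expected in model_shapes.items()
            if name in ckpt_shapes and ckpt_shapes[name] != expected}

model = {"fc1.weight": (128, 60768),
         "fc_out.weight": (3, 64),
         "fc_out.bias": (3,)}
ckpt = {"fc1.weight": (128, 1280),
        "fc_out.weight": (4, 64),
        "fc_out.bias": (4,)}
bad = shape_mismatches(model, ckpt)  # all three parameters mismatch
```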

I also tried creating the model again:

python -m modeling.train -c modeling/config/trainer.yml features/feature-extraction
2023-12-14 13:58:07 __main__ INFO     4595099136 Using config: {'num_epochs': 5, 'num_splits': 5, 'img_enc_name': 'convnext_tiny', 'block_guids_train': ['cpb-aacip-254-75r7szdz'], 'block_guids_valid': ['cpb-aacip-254-75r7szdz', 'cpb-aacip-259-4j09zf95', 'cpb-aacip-526-hd7np1xn78', 'cpb-aacip-75-72b8h82x', 'cpb-aacip-fe9efa663c6', 'cpb-aacip-f5847a01db5', 'cpb-aacip-f2a88c88d9d', 'cpb-aacip-ec590a6761d', 'cpb-aacip-c7c64922fcd', 'cpb-aacip-f3fa7215348', 'cpb-aacip-f13ae523e20', 'cpb-aacip-e7a25f07d35', 'cpb-aacip-ce6d5e4bd7f', 'cpb-aacip-690722078b2', 'cpb-aacip-e649135e6ec', 'cpb-aacip-15-93gxdjk6', 'cpb-aacip-512-4f1mg7h078', 'cpb-aacip-512-4m9183583s', 'cpb-aacip-512-4b2x34nt7g', 'cpb-aacip-512-3n20c4tr34', 'cpb-aacip-512-3f4kk9534t'], 'num_layers': 3, 'dropouts': 0.1, 'pos_enc_name': 'sinusoidal-concat', 'pos_unit': 60000, 'pos_enc_dim': 512, 'pos_max_input_length': 5640000, 'bins': {'pre': {'slate': ['S'], 'chyron': ['I', 'N', 'Y'], 'credit': ['C']}}}
2023-12-14 13:58:07 __main__ WARNING  4595099136 sinusoidal-concat
2023-12-14 13:58:08 __main__ INFO     4595099136 train: 0 videos, 0 images, valid: 0 videos, 0 images
2023-12-14 13:58:08 __main__ INFO     4595099136 Skipping fold 0 due to lack of data
2023-12-14 13:58:08 __main__ WARNING  4595099136 sinusoidal-concat
2023-12-14 13:58:08 __main__ INFO     4595099136 train: 0 videos, 0 images, valid: 0 videos, 0 images
2023-12-14 13:58:08 __main__ INFO     4595099136 Skipping fold 1 due to lack of data
2023-12-14 13:58:08 __main__ WARNING  4595099136 sinusoidal-concat
2023-12-14 13:58:09 __main__ INFO     4595099136 train: 0 videos, 0 images, valid: 0 videos, 0 images
2023-12-14 13:58:09 __main__ INFO     4595099136 Skipping fold 2 due to lack of data
2023-12-14 13:58:09 __main__ WARNING  4595099136 sinusoidal-concat
2023-12-14 13:58:09 __main__ INFO     4595099136 train: 0 videos, 0 images, valid: 0 videos, 0 images
2023-12-14 13:58:09 __main__ INFO     4595099136 Skipping fold 3 due to lack of data
2023-12-14 13:58:09 __main__ WARNING  4595099136 sinusoidal-concat
2023-12-14 13:58:10 __main__ INFO     4595099136 train: 0 videos, 0 images, valid: 0 videos, 0 images
2023-12-14 13:58:10 __main__ INFO     4595099136 Skipping fold 4 due to lack of data
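The "0 videos, 0 images" lines suggest every extracted video ended up excluded: block_guids_valid in the config above lists many guids, and if the block lists happen to cover every guid found under features/feature-extraction, all folds come out empty. A hypothetical sketch of that filtering (not the trainer's actual code):

```python
def split_guids(all_guids, block_guids_train, block_guids_valid):
    # hypothetical version of the trainer's split: drop blocked
    # guids from each side independently
    train = [g for g in all_guids if g not in set(block_guids_train)]
    valid = [g for g in all_guids if g not in set(block_guids_valid)]
    return train, valid

# if the only extracted guid is blocked on both sides, both splits
# are empty and the fold is skipped
train, valid = split_guids(
    ["cpb-aacip-254-75r7szdz"],
    ["cpb-aacip-254-75r7szdz"],
    ["cpb-aacip-254-75r7szdz"],
)
```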

And after that we get an error.

@keighrim
Member Author

New commits contain many fixes including a fix for the positional encoder bug that ended up with 60768-dimensional vectors.
Replaced built-in models in modeling/models directory, trained with the included trainer.yml config file, and should be compatible with the included classifier.yml config file.
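For context, "sinusoidal-concat" appends a sinusoidal position encoding to the image feature vector, so the dimension should grow by pos_enc_dim (e.g. 1280-dim convnext_tiny features plus a 512-dim encoding gives 1792), not balloon to 60768. A stdlib-only sketch of one plausible intended behavior (the actual dimensions depend on the config):

```python
import math

def sinusoidal_encoding(pos, dim):
    # standard sin/cos position encoding of length `dim`
    enc = []
    for i in range(0, dim, 2):
        angle = pos / (10000 ** (i / dim))
        enc.append(math.sin(angle))
        if i + 1 < dim:
            enc.append(math.cos(angle))
    return enc

def encode_with_position(img_feats, pos, pos_enc_dim=512):
    # "concat" variant: the feature vector grows by pos_enc_dim
    return img_feats + sinusoidal_encoding(pos, pos_enc_dim)
```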

@marcverhagen
Contributor

With the latest changes the classifier now runs on the convnext model with positional encodings. After some more testing I will merge this into the 14-clamsapp branch and prepare a new app version.

@marcverhagen
Contributor

... or I may just review this pull request so it can be merged into develop

@keighrim
Member Author

I made one more small change before merging this (please see the latest commit message for details).

@keighrim keighrim merged commit f033551 into develop Dec 15, 2023
@keighrim keighrim deleted the refactor-feat-extractor branch December 17, 2023 01:50
@marcverhagen marcverhagen restored the refactor-feat-extractor branch February 7, 2024 21:13
@marcverhagen marcverhagen deleted the refactor-feat-extractor branch February 7, 2024 21:14