
image-based data preprocessing #112

Closed · keighrim opened this issue Jul 19, 2024 · 1 comment · Fixed by #115
Labels
✨N New feature or request

Comments

keighrim (Member) commented Jul 19, 2024

New Feature Summary

At the moment, the data preprocessor expects one video file and one CSV file of manual label annotations:

```python
parser.add_argument("-i", "--input-video",
                    help="filepath for the video to be processed.",
                    required=True)
parser.add_argument("-c", "--annotation-csv",
                    help="filepath for the csv containing timepoints + labels.",
                    required=True)
```

to prepare CNN feature vectors and a metadata JSON file:

```python
with open(f"{args.outdir}/{feat_metadata['guid']}.json", 'w', encoding='utf8') as f:
    json.dump(feat_metadata, f)
for name, vectors in feat_mats.items():
    np.save(f"{args.outdir}/{feat_metadata['guid']}.{name}", vectors)
```
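For reference, the metadata file written here has roughly the following shape. This is a reconstruction inferred from the reading code further below (`guid`, `duration`, and per-frame `curr_time`/`label`/`mod`); all values are illustrative:

```python
# Reconstructed metadata layout; values are made up for the example.
feat_metadata = {
    "guid": "cpb-aacip-f3fa7215348",
    "duration": 1800000,  # total video length, presumably in milliseconds (assumed unit)
    "frames": [
        {"curr_time": 0,    "label": "B", "mod": False},  # labels here are illustrative
        {"curr_time": 1000, "label": "S", "mod": True},   # mod == True marks a "transitional" frame
    ],
}
```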

However, we are now receiving additional annotations from GBH that cover more videos but in a much sparser way. Most importantly, the video files are not part of the delivery package; extracted frame images are.

To cope with this different data situation for the next rounds of training, we need to update the data preprocessor to handle the new batches of annotations.

Additional context

Current train-ready preprocessed data looks like this:

```
$ ls feature-extraction/cpb-aacip-f3fa7215348*
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.bn_vgg16.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.bn_vgg19.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.convnext_base.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.convnext_lg.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.convnext_small.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.convnext_tiny.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.densenet121.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.efficientnet_large.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.efficientnet_med.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.efficientnet_small.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.json
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.resnet101.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.resnet152.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.resnet18.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.resnet50.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.vgg16.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.vgg19.npy
```

Here's where the preprocessed data is read:

```python
for j in Path(indir).glob('*.json'):
    guid = j.with_suffix("").name
    feature_vecs = np.load(Path(indir) / f"{guid}.{configs['img_enc_name']}.npy")
    labels = json.load(open(Path(indir) / f"{guid}.json"))
    total_video_len = labels['duration']
    for i, vec in enumerate(feature_vecs):
        if not labels['frames'][i]['mod']:  # skip frames marked as "transitional"
            pre_binned_label = pretraining_bin(labels['frames'][i]['label'], configs)
            vector = torch.from_numpy(vec)
            position = labels['frames'][i]['curr_time']
            vector = extractor.encode_position(position, total_video_len, vector)
            if guid in validation_guids:
                valid_vimg += 1
                valid_vectors.append(vector)
                valid_labels.append(pre_binned_label)
            elif guid in train_guids:
                train_vimg += 1
                train_vectors.append(vector)
                train_labels.append(pre_binned_label)
```
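Note that this reader only consumes `duration` plus per-frame `curr_time`, `label`, and `mod`, so an image-based preprocessor just has to synthesize the same metadata from the delivered frames. A rough sketch, assuming purely for illustration (this is not a confirmed delivery convention) that frame images are named `{guid}_{ms}.jpg` with their millisecond offset:

```python
from pathlib import Path

def metadata_from_image_dir(img_dir, guid, duration_ms, csv_labels):
    """Build a reader-compatible metadata dict from pre-extracted frame images.

    Assumes hypothetical '{guid}_{ms}.jpg' filenames; `csv_labels` maps a
    millisecond timepoint to its manual label from the annotation CSV.
    """
    # sort numerically by the timestamp embedded in the filename, so the frame
    # order matches the order of the saved feature vectors
    images = sorted(Path(img_dir).glob(f"{guid}_*.jpg"),
                    key=lambda p: int(p.stem.rsplit("_", 1)[1]))
    frames = []
    for img in images:
        ms = int(img.stem.rsplit("_", 1)[1])
        frames.append({
            "curr_time": ms,
            "label": csv_labels[ms],
            "mod": False,  # assumption: sparse annotations carry no "transitional" flag
        })
    return {"guid": guid, "duration": duration_ms, "frames": frames}
```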

And finally, due to the sparsity of the annotation work in the next batches, we need to add the new GUIDs to this list:

```python
block_guids_valid = [
    [  # block all loosely-annotated videos
```

keighrim added the ✨N New feature or request label Jul 19, 2024
keighrim (Member, Author) commented:
For the sake of implementation, let's change

```python
parser.add_argument("-i", "--input-video",
                    help="filepath for the video to be processed.",
                    required=True)
```

to accept either a single video file name or the name of a directory containing the extracted image files.
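A minimal sketch of how that dispatch could look; renaming the flag to `--input` and the `process_video`/`process_image_dir` helpers are illustrative assumptions, not what was merged in #115:

```python
import argparse
from pathlib import Path

parser = argparse.ArgumentParser()
parser.add_argument("-i", "--input",
                    help="filepath of the video to be processed, or a directory "
                         "of pre-extracted frame images.",
                    required=True)
parser.add_argument("-c", "--annotation-csv",
                    help="filepath for the csv containing timepoints + labels.",
                    required=True)
args = parser.parse_args()

src = Path(args.input)
if src.is_dir():
    # sparse GBH deliveries: frame images arrive instead of the source video
    images = sorted(p for p in src.iterdir()
                    if p.suffix.lower() in {".jpg", ".jpeg", ".png"})
    process_image_dir(images, args.annotation_csv)  # hypothetical helper
else:
    process_video(src, args.annotation_csv)         # hypothetical helper
```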
