python3 -m venv .venv && source .venv/bin/activate
pip3 install -r requirements.txt
Run the command below to validate the dataset. If no error message appears, the dataset is ready to use. If an error occurs, refer to the error message to resolve the issue.
python3 run.py --format {dataset_format} \
--task {task type} \
--storage_type {storage_type} \
--yaml_path {data.yaml file path(only required in yolo format)} \
--train_dir {train folder path} --valid_dir {validation folder path} \
--test_dir {test folder path} \
--output_dir {output path} \
--id2label_path {id2label.json file path(only required in unet format)}
Argument | Values | Description | Required |
---|---|---|---|
--format | yolo / voc / coco / unet / imagenet | Format of the dataset | Yes |
--task | image_classification / object_detection / semantic_segmentation | Task of the dataset | Yes |
--storage_type | s3 / local | Storage type of the dataset | Yes |
--yaml_path | e.g. ./coco_dataset/data.yaml | Path to the yaml file of a yolo format dataset | Only for the yolo format |
--train_dir | e.g. ./coco_dataset/train | Path to the training dataset folder | Yes |
--test_dir | e.g. ./coco_dataset/test | Path to the test dataset folder | No |
--valid_dir | e.g. ./coco_dataset/val | Path to the validation dataset folder | No |
--output_dir | e.g. ./validation_dataset | Output folder where the validation results are stored | No |
--id2label_path | e.g. ./unet_dataset/id2label.json | Path to the id2label file of a unet format dataset | Only for the unet format |
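
The conditional requirements in the table can be sketched as a small wrapper that assembles the command line. This is a minimal sketch, not part of the validator: `build_validator_command` is a hypothetical helper, and the flag names are copied from the argument table above.

```python
import shlex


def build_validator_command(fmt, task, storage_type, train_dir,
                            valid_dir=None, test_dir=None, output_dir=None,
                            yaml_path=None, id2label_path=None):
    """Return the argv list for run.py (hypothetical helper, not shipped with the tool)."""
    # Format-specific requirements taken from the "Required" column.
    if fmt == "yolo" and yaml_path is None:
        raise ValueError("--yaml_path is required for the yolo format")
    if fmt == "unet" and id2label_path is None:
        raise ValueError("--id2label_path is required for the unet format")

    argv = ["python3", "run.py",
            "--format", fmt,
            "--task", task,
            "--storage_type", storage_type,
            "--train_dir", train_dir]
    optional = [("--valid_dir", valid_dir), ("--test_dir", test_dir),
                ("--output_dir", output_dir), ("--yaml_path", yaml_path),
                ("--id2label_path", id2label_path)]
    # Only pass the flags that were actually supplied.
    for flag, value in optional:
        if value is not None:
            argv += [flag, value]
    return argv


cmd = build_validator_command("yolo", "object_detection", "local",
                              "./dataset/train", valid_dir="./dataset/val",
                              yaml_path="./data.yaml")
print(shlex.join(cmd))  # shell-quoted command line, ready to paste
```

The returned list can also be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with paths that contain spaces.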
python3 run.py --format {dataset_format} \
--task {task type} \
--storage_type {storage_type} \
--dataset_root_path {dataset root folder path} \
--yaml_path {data.yaml file path(only required in yolo format)} \
--id2label_path {id2label.json file path(only required in unet format)} \
--server_info_path {server_info_netspresso.json file path}
arguments:
Argument | Values | Description | Required |
---|---|---|---|
--format | yolo / voc / coco / unet / imagenet | Format of the dataset | Yes |
--task | image_classification / object_detection / semantic_segmentation | Task of the dataset | Yes |
--storage_type | s3 / local | Storage type of the dataset | Yes |
--dataset_root_path | e.g. ./coco_dataset/ | Dataset root path | Yes |
--yaml_path | e.g. ./coco_dataset/data.yaml | Path to the yaml file of a yolo format dataset | Only for the yolo format |
--id2label_path | e.g. ./unet_dataset/id2label.json | Path to the id2label file of a unet format dataset | Only for the unet format |
--server_info_path | e.g. ./netspresso/server_info_netspresso.json | Server information file generated when a personal training server is registered | Yes |
certification.np contains the certification together with information about the files created alongside it. Do not modify any of these files after validation: if a file changes, the certification becomes invalid and you may be unable to upload the dataset to NetsPresso.
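
One way to notice accidental changes before uploading is to record a checksum of each generated file right after validation and compare later. The sketch below is independent of the validator: it does not read or reproduce the internal format of certification.np, and the helper names (`sha256_of`, `snapshot`, `changed_files`) are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(folder):
    """Map each file name directly inside `folder` to its SHA-256 digest."""
    return {p.name: sha256_of(p)
            for p in sorted(Path(folder).iterdir()) if p.is_file()}


def changed_files(before, after):
    """Return names whose digest differs, or that were added or removed."""
    return sorted(name for name in before.keys() | after.keys()
                  if before.get(name) != after.get(name))
```

Call `snapshot()` on the output folder once validation finishes, then again just before the upload; an empty result from `changed_files()` means nothing in that folder was touched in between.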
netspresso@netspresso:~/NetsPresso-ModelSearch-Dataset-Validator$ python3 run.py --format yolo --yaml_path ./data.yaml --train_dir ./dataset/train --valid_dir ./dataset/val --test_dir ./dataset/test --output_dir ./output
Start dataset validation.
Validation completed! Now try your dataset on NetsPresso!
Five (or four) files will be created at the selected output path.
train.zip
val.zip
test.zip
certification.np
data.yaml
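
A quick check that the listed files were produced can be sketched as follows. `missing_outputs` is a hypothetical helper, and the file names are taken from the list above. Because the run may legitimately create four files instead of five, a single missing entry is not necessarily an error.

```python
from pathlib import Path

# File names copied from the list of expected outputs.
EXPECTED_OUTPUTS = ["train.zip", "val.zip", "test.zip",
                    "certification.np", "data.yaml"]


def missing_outputs(output_dir, expected=EXPECTED_OUTPUTS):
    """Return the expected file names that are not present in output_dir."""
    out = Path(output_dir)
    return [name for name in expected if not (out / name).is_file()]


if __name__ == "__main__":
    # "./output" matches the --output_dir used in the example run.
    print(missing_outputs("./output"))
```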
Two files will be created at the dataset root path, as shown below.
certification.np
data.yaml
For more details, see the [validation checklist][validationchecklist].
If validation fails with a traceback, read the exception message at the end of the traceback.
netspresso@netspresso:~/NetsPresso-ModelSearch-Dataset-Validator$ python3 run.py --format yolo --task object_detection --yaml_path ./data.yaml --train_dir ./dataset/train --valid_dir ./dataset/val --test_dir ./dataset/test --output_dir ./output
Start dataset validation.
Traceback (most recent call last):
File "run.py", line 13, in <module>
validate(dir_path, num_classes, dataset_type)
File "/hdd1/home/NetsPresso-ModelSearch-Dataset-Validator/src/utils.py", line 307, in validate
yaml_label, errors = validate_data_yaml(dir_path, num_classes, errors)
File "/hdd1/home/NetsPresso-ModelSearch-Dataset-Validator/src/utils.py", line 199, in validate_data_yaml
raise YamlException("There is no 'names' in data.yaml.")
src.exceptions.YamlException: There is no 'names' in data.yaml.
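
The failure above can be caught before running the validator with a small pre-check on data.yaml. This is a sketch under assumptions: `check_data_yaml` and `load_and_check` are hypothetical helpers, not the validator's own `validate_data_yaml`, and loading the file relies on the third-party PyYAML package, which may or may not be listed in requirements.txt.

```python
def check_data_yaml(config):
    """Return a list of problems found in an already-parsed data.yaml mapping."""
    if not isinstance(config, dict):
        return ["data.yaml should contain a top-level mapping"]
    problems = []
    names = config.get("names")
    if names is None:
        # Same condition that triggers the YamlException shown above.
        problems.append("There is no 'names' in data.yaml.")
    elif not isinstance(names, (list, dict)) or len(names) == 0:
        problems.append("'names' should be a non-empty list or mapping of class names")
    return problems


def load_and_check(yaml_path):
    """Parse the file with PyYAML (assumed to be installed) and check it."""
    import yaml  # third-party: pip install pyyaml
    with open(yaml_path, encoding="utf-8") as fh:
        return check_data_yaml(yaml.safe_load(fh))
```

An empty list from `load_and_check("./data.yaml")` means the `names` entry is present; it does not guarantee that the rest of the validation will pass.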
If validation fails with the message "Validation error, please check 'validation_result.txt'.", open the validation_result.txt file to find out what needs to be fixed.
netspresso@netspresso:~/NetsPresso-ModelSearch-Dataset-Validator$ python3 run.py --format yolo --task object_detection --yaml_path ./data.yaml --train_dir ./dataset/train --valid_dir ./dataset/val --test_dir ./dataset/test --output_dir ./output
Start dataset validation.
Validation error, please check 'validation_result.txt'.
The contents of 'validation_result.txt' look like this:
There is no image file for annotation file 'yolo/train/labels/000000000025.txt'
There is no image file for annotation file 'yolo/test/labels/000000000337.txt'
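
Errors of this kind can be located in advance by listing annotation files that have no matching image. The sketch below assumes each split folder contains sibling `images/` and `labels/` directories and that images use one of the listed extensions; the `labels/` part matches the paths in the messages above, while the `images/` folder name, the extension set, and the helper name `labels_without_images` are assumptions.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def labels_without_images(split_dir):
    """Return label files in <split_dir>/labels with no same-named image in <split_dir>/images."""
    split = Path(split_dir)
    image_stems = {p.stem for p in (split / "images").glob("*")
                   if p.suffix.lower() in IMAGE_EXTENSIONS}
    return sorted(p for p in (split / "labels").glob("*.txt")
                  if p.stem not in image_stems)


if __name__ == "__main__":
    for split in ("train", "val", "test"):
        for label in labels_without_images(Path("./dataset") / split):
            print(f"There is no image file for annotation file '{label}'")
```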
Please read this link: Dataset Structure