Pass explicit filenames, not subsets, for training #263

Open
JimCircadian opened this issue May 22, 2024 · 3 comments
@JimCircadian
Member

  • IceNet version: 0.2.9

Description

When running an icenet_train command, you pass the dataset identifier rather than the configuration filename. This seems counterintuitive and I'm not entirely sure what it offers; ideally we should pass a filename. Thoughts @bnubald?

What I Did

$ icenet_train_tensorflow -nw -e 1 dataset_config.miniscule_bas_south.json local_test1 42

  File "/home/jambyr/icenet2/icenet/icenet/data/dataset.py", line 77, in __init__
    self._load_configuration(configuration_path)
  File "/home/jambyr/icenet2/icenet/icenet/data/dataset.py", line 130, in _load_configuration
    raise OSError("{} not found".format(path))
OSError: dataset_config.dataset_config.miniscule_bas_south.json.json not found

Note that I'm using a dev branch, so the line numbers are awry, but this applies to the 0.2 release branches. We should change this, and the equivalent for MergedIceNetDataSet:

dataset = IceNetDataSet("dataset_config.{}.json".format(args.dataset),
                        batch_size=args.batch_size,
                        shuffling=args.shuffle_train)
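
For reference, the doubled name in the OSError above follows directly from that format call. A minimal sketch, in which the TEMPLATE constant and the variable names are illustrative and not part of IceNet:

```python
# The template string used in the snippet above.
TEMPLATE = "dataset_config.{}.json"

# Passing the bare identifier gives the expected configuration filename.
identifier_path = TEMPLATE.format("miniscule_bas_south")

# Passing the full filename wraps it a second time, which matches the
# "not found" path reported in the traceback.
filename_path = TEMPLATE.format("dataset_config.miniscule_bas_south.json")

print(identifier_path)  # dataset_config.miniscule_bas_south.json
print(filename_path)    # dataset_config.dataset_config.miniscule_bas_south.json.json
```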
JimCircadian added the "good first issue" (Good for newcomers) label May 22, 2024
@JimCircadian
Member Author

This will require changes to the pipeline as well. Leaving this here as a record for someone to address; it might be worth checking for other instances of this type of behaviour.

@bnubald
Collaborator

bnubald commented May 22, 2024

@JimCircadian, yes, it is counter-intuitive, but I assume we would like to keep the legacy option?

E.g. detect whether the file extension is json: if it is, use the argument as the filename; otherwise, use the existing approach.
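
A rough sketch of that detection, purely as an assumption about how it might look: resolve_dataset_config is a hypothetical helper name, not an existing IceNet function, and the legacy template is copied from the snippet in the issue description.

```python
import os


def resolve_dataset_config(dataset_arg: str) -> str:
    """Hypothetical helper: map the CLI dataset argument to a config filename.

    If the argument already ends in .json it is treated as an explicit
    filename; otherwise it is treated as a legacy dataset identifier and
    wrapped in the existing naming template.
    """
    _, ext = os.path.splitext(dataset_arg)
    if ext.lower() == ".json":
        # Explicit filename: use it unchanged.
        return dataset_arg
    # Legacy identifier: keep the existing behaviour.
    return "dataset_config.{}.json".format(dataset_arg)


print(resolve_dataset_config("dataset_config.miniscule_bas_south.json"))
print(resolve_dataset_config("miniscule_bas_south"))
```

Both calls resolve to dataset_config.miniscule_bas_south.json, so existing pipelines that pass identifiers would keep working under this approach.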

@JimCircadian
Member Author

I like your style @bnubald, I'll stick this into my existing work stream 😄

JimCircadian self-assigned this May 23, 2024
JimCircadian added a commit to JimCircadian/icenet that referenced this issue May 23, 2024
…ly qualified dataset filenames and Dev icenet-ai#252: refactoring of existing training functionality to allow extension to use horovod for fully distributed training as a child implementation of the original tensorflow
JimCircadian added this to the v0.2.9 milestone May 23, 2024