Status quo caching extension #53
I'll have a look at the unit tests.
Codecov Report

| Coverage Diff | master | #53 | +/- |
|---|---|---|---|
| Coverage | 37.34% | 35.43% | -1.91% |
| Files | 15 | 18 | +3 |
| Lines | 739 | 807 | +68 |
| Branches | 110 | 122 | +12 |
| Hits | 276 | 286 | +10 |
| Misses | 449 | 506 | +57 |
| Partials | 14 | 15 | +1 |
Hello @tfesenbecker! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2019-11-08 09:14:17 UTC
a5d4577 to b032a0d
First, great progress! 🥇
General
Please try to attach docstrings to methods and classes. You can also include information on what you are still planning to do, what is not working yet, etc. This helps in reading the new stuff going on here and can improve our feedback for you :)
Comments on the Caching Approach Itself
It currently seems to me that parts of files can only be added to a storage but never be deleted again.
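To make the point concrete, here is a minimal sketch of a storage that can also drop files again. It does not reflect the storage class in this PR: `EvictingStorage`, its methods, and the least-recently-used policy are all assumptions chosen for illustration, and any other eviction strategy could be swapped in.

```python
from collections import OrderedDict


class EvictingStorage:
    """Hypothetical storage that supports both adding and removing files.

    Not the class from this PR; the LRU policy is an assumption.
    """

    def __init__(self, capacity: float):
        self.capacity = capacity
        # Maps file name -> size, ordered from least to most recently used.
        self._files = OrderedDict()

    @property
    def used(self) -> float:
        """Total size of all files currently held."""
        return sum(self._files.values())

    def files(self) -> list:
        """File names, least recently used first."""
        return list(self._files)

    def touch(self, name: str) -> None:
        """Mark a file as recently used."""
        self._files.move_to_end(name)

    def remove(self, name: str) -> float:
        """Delete a file explicitly and return its size."""
        return self._files.pop(name)

    def add(self, name: str, size: float) -> None:
        """Add a file, evicting least recently used files if space is short."""
        if size > self.capacity:
            raise ValueError("file is larger than the storage capacity")
        # Re-adding an existing file replaces the old entry.
        self._files.pop(name, None)
        while self.used + size > self.capacity:
            self._files.popitem(last=False)  # evict the oldest entry
        self._files[name] = size


storage = EvictingStorage(capacity=10)
storage.add("a", 4)
storage.add("b", 4)
storage.touch("a")   # "b" is now the least recently used file
storage.add("c", 4)  # not enough space left, so "b" is evicted
```

After these calls the storage holds `a` and `c`; an explicit `storage.remove("a")` frees another 4 units.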
Importing Storage Information
In general, I am not yet convinced that the storage data should be imported the same way as the other input files. There is a difference: the other input files are exported from existing tools and have a defined format, so we define input and output methods for them. The storage is different: it sets up the environment for the simulation, so it is more of an in-simulation configuration.
What I currently also don't know is whether we could combine this information with configuration data from Tardis (I am planning to support importing that as well).
@@ -15,6 +15,7 @@ def __init__(
pool_resources: dict,
scheduling_duration: float,
ignore_resources: list = None,
sitename: str = None,
Could you please explain the reasoning for including `sitename` in the drone? From my point of view, this might be too specific. I currently also don't see the advantage of having something like `sitename`. But I bet you do have some very good reasons! :)
I'd be careful about using `sitename` -- it is very specific jargon (a CMS feature) and may not properly reflect how we need to model caches. E.g. what we've seen is that for a single "site" there may be different cache/storage requirements -- see Chris' results on benchmarking TSystems. If we need to identify the "site", that is basically the `Pool` of the drone -- which can already be used as a `dict` key etc.
Is there some advantage to using a string identifier?
`sitename` is just a quick workaround for the mapping of pools/drones and storage elements. I want to replace this as soon as I understand what information/means of identification will be available once other changes to the overall system (adding more of the jobs ClassAd structure, ...) are done. Passing this information directly to the drone was the option that needed the fewest changes to existing code.
@maxfischer2781 What do you mean by
> which can already be used as a dict key etc.

?
> What do you mean by ...

A `Pool` instance can be used as a key in a mapping -- e.g. with a `Dict[Pool, Storage]` it should be possible to map from pool (site) to storage and vice versa.
Which of course isn't of much use to you if you want to define this outside of the simulation, e.g. in a JSON/YAML configuration file...
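As a minimal sketch of the in-simulation variant: the `Pool` and `Storage` classes below are hypothetical stand-ins, not the lapis classes, and `pools_of` is an illustrative helper. The only property relied on is that plain Python objects are hashable by identity, so instances can serve as `dict` keys without a string identifier.

```python
from typing import Dict, List


class Pool:
    """Hypothetical stand-in for a pool; hashable by identity by default."""

    def __init__(self, name: str):
        self.name = name


class Storage:
    """Hypothetical stand-in for a storage element or cache."""

    def __init__(self, name: str, size: float):
        self.name = name
        self.size = size


pool_a = Pool("pool_a")
pool_b = Pool("pool_b")
cache = Storage("cache_1", size=1000.0)

# Forward direction: look up the storage that serves a given pool.
storage_of: Dict[Pool, Storage] = {pool_a: cache, pool_b: cache}


def pools_of(storage: Storage, mapping: Dict[Pool, Storage]) -> List[Pool]:
    """Reverse direction: collect every pool mapped to the given storage."""
    return [pool for pool, candidate in mapping.items() if candidate is storage]
```

The reverse lookup scans the mapping, which is fine for a handful of pools; a second dictionary keyed by `Storage` would avoid the scan if it ever matters.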
Co-Authored-By: Eileen Kuehn <[email protected]>
I've added a mixture of new proposals, food for thought and change requests. Note that if you feel a change may be worthwhile but want to get basic functionality working first, just add a ticket to add it later. The label recommendation is likely such a case.
Last night I thought again about the So I must revert one of my previous comments: This value must stay a value that you set after you know what its value will actually be. I think this means that the run method of the job itself needs extension. So you must have the connection to the What do you think?
I'm a bit confused about your mention of the
I decided to change the jobs
…r, one unit test fails because job has no drone, ticket already exists
…achingextension
Feature/storageimprovement
fixed bug from merge
@eileen-kuehn @tfesenbecker as discussed, I've changed the target branch to
Current status of the caching extension of lapis.
The new code contains
As this is only a draft, the code contains various debug outputs to track the execution of jobs.