Status quo caching extension #53
Merged
tfesenbecker
merged 36 commits into
MatterMiners:feature/caching
from
tfesenbecker:cachingextension
Nov 19, 2019
9215d64
added storage object to represent caches and associated readout funct…
tfesenbecker 0a46da7
extended CLI to support storage files
tfesenbecker e727b69
extended simulator to support storage files
tfesenbecker d64954d
added new drone attribute sitename connecting drones and storage elem…
tfesenbecker 5597434
added file provider object connecting storage objects and jobs
tfesenbecker bb1dcbe
added different caching/cache cleaning/walltime recalculation algorithms
tfesenbecker 8d6db96
renamed storage readout
tfesenbecker fb150db
fixed debug output
tfesenbecker 69072ae
renamed storage input reader
tfesenbecker 53ebec5
updated Job class
tfesenbecker f997223
replaced function modifying walltime by function with property decorator
tfesenbecker 2e2c06f
Revert "replaced function modifying walltime by function with propert…
tfesenbecker 110b3e9
replaced function modifying walltime by function with property decorator
tfesenbecker b032a0d
resolving PEP8 issues
tfesenbecker 7753d0d
Merge branch 'master' of https://github.com/MatterMiners/lapis into c…
tfesenbecker 5123034
fixed file provider bug (wrong inputfiles dictionary)
tfesenbecker 1c2fe9f
Update lapis/cli/simulate.py
tfesenbecker 8739ce9
renamed function get_used_storage to _calculate_used_storage
tfesenbecker 0b5a922
Merge branch 'cachingextension' of https://github.com/tfesenbecker/la…
tfesenbecker 855242a
attached fileprovider to drone instead of job and passed it via make_…
tfesenbecker bfadacb
reworked file coverage function to return a score
tfesenbecker 3f30c58
added proper __repr__ function
tfesenbecker 2b214aa
added file classes
tfesenbecker 2bd91d7
moved caching algorithm and associated cache cleanup to its own class
tfesenbecker 29576eb
Redesign of the storage class and associated changes
tfesenbecker 146fbe3
put walltime getter and walltime recalculation back in separate methods
tfesenbecker 7ef8dd9
added parallel treatment of jobs input files in file provider
tfesenbecker b94ab82
fixed failed unit test that were caused by Drone without file provide…
tfesenbecker 1e9e795
Merge branch 'master' of https://github.com/MatterMiners/lapis into c…
tfesenbecker 191df2b
changed scoring to take filesizes into consideration
tfesenbecker a635318
Merge branch 'cachingextension' into feature/storageimprovement
tfesenbecker 6f7ace1
Merge pull request #1 from tfesenbecker/feature/storageimprovement
tfesenbecker 75165ad
fixed bug from merge
tfesenbecker d943ed6
Merge branch 'cachingextension' of https://github.com/tfesenbecker/la…
tfesenbecker 9453632
Merge pull request #2 from tfesenbecker/feature/storageimprovement
tfesenbecker 32faa38
removed debug output to fix unit test
tfesenbecker
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
from lapis.storage import Storage | ||
|
||
|
||
class FileProvider(object): | ||
|
||
|
||
__slots__ = ("storages",) | ||
|
||
def __init__(self): | ||
self.storages = dict() | ||
|
||
def add_storage_element(self, storage_element: Storage): | ||
try: | ||
self.storages[storage_element.sitename].append(storage_element) | ||
except KeyError: | ||
self.storages[storage_element.sitename] = [storage_element] | ||
|
||
def provides_all_files(self, job): | ||
|
||
""" | ||
Dummy implementation, to be replaced: if a part of every inputfile of the job is | ||
provided by a storage element located on the same site as the drone the job | ||
is running on this function returns True | ||
:param job: | ||
:return: | ||
""" | ||
provided_storages = self.storages.get(job.drone.sitename, None) | ||
|
||
if provided_storages: | ||
for inputfilename, inputfilespecs in job.used_inputfiles.items(): | ||
provides_inputfile = 0 | ||
for storage in provided_storages: | ||
provides_inputfile += storage.provides_file( | ||
(inputfilename, inputfilespecs) | ||
) | ||
|
||
if not provides_inputfile: | ||
return False | ||
return True | ||
|
||
else: | ||
return False |
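The try/except KeyError grouping in add_storage_element can also be written with collections.defaultdict. The following is only a minimal sketch: FakeStorage and SiteRegistry are hypothetical stand-ins (not part of lapis), and only the sitename attribute is taken from the diff above.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class FakeStorage(NamedTuple):
    """Hypothetical stand-in for lapis.storage.Storage (assumption)."""

    name: str
    sitename: str


class SiteRegistry:
    """Groups storage elements by site name, like FileProvider.storages."""

    def __init__(self) -> None:
        # Missing keys start out as an empty list, so no KeyError handling.
        self.storages: Dict[str, List[FakeStorage]] = defaultdict(list)

    def add_storage_element(self, storage_element: FakeStorage) -> None:
        self.storages[storage_element.sitename].append(storage_element)

    def storages_at(self, sitename: str) -> List[FakeStorage]:
        # .get() does not create an empty entry for unknown sites.
        return self.storages.get(sitename, [])


registry = SiteRegistry()
registry.add_storage_element(FakeStorage("cache_a", "site1"))
registry.add_storage_element(FakeStorage("cache_b", "site1"))
registry.add_storage_element(FakeStorage("cache_c", "site2"))
```

Looking up an unknown site through storages_at returns an empty list, which matches the falsy check on provided_storages in provides_all_files.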
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,81 @@ | ||
from usim import time | ||
|
||
from typing import Optional | ||
|
||
from lapis.utilities.cache_algorithm_implementations import cache_algorithm | ||
from lapis.utilities.cache_cleanup_implementations import cache_cleanup | ||
|
||
|
||
class Storage(object): | ||
|
||
__slots__ = ("name", "sitename", "storagesize", "usedstorage", "content") | ||
|
||
def __init__( | ||
self, name: str, sitename: str, storagesize: int, content: Optional[dict] = None | ||
|
||
): | ||
self.name = name | ||
|
||
self.sitename = sitename | ||
|
||
self.storagesize = storagesize | ||
self.content = content if content is not None else {} | ||
self.usedstorage = self.get_used_storage() | ||
|
||
self.describe_state() | ||
|
||
def get_used_storage(self): | ||
return sum(subdict["usedsize"] for subdict in self.content.values()) | ||
|
||
|
||
def free_space(self): | ||
return self.storagesize - self.usedstorage | ||
|
||
|
||
def place_new_file(self, filerequest: tuple): | ||
|
||
filename, filespecs = filerequest | ||
if self.free_space() - filespecs["usedsize"] < 0: | ||
self.make_room(filespecs["usedsize"]) | ||
self.content.update({filename: filespecs}) | ||
self.content[filename].update( | ||
cachedsince=time.now, lastaccessed=time.now, numberofaccesses=0 | ||
) | ||
self.usedstorage = self.get_used_storage() | ||
|
||
|
||
def update_file(self, filerequest: tuple): | ||
filename, filespecs = filerequest | ||
requested_file = filename | ||
|
||
filesize_difference = ( | ||
filespecs["usedsize"] - self.content[requested_file]["usedsize"] | ||
) | ||
if filesize_difference > 0: | ||
self.make_room(filesize_difference) | ||
self.content[requested_file]["usedsize"] += filesize_difference | ||
|
||
self.content[requested_file]["lastaccessed"] = time.now | ||
self.content[requested_file]["numberofaccesses"] += 1 | ||
self.usedstorage = self.get_used_storage() | ||
|
||
def make_room(self, filesize_difference: int): | ||
if self.free_space() - filesize_difference < 0: | ||
cache_cleanup["fifo"](filesize_difference, self) | ||
|
||
|
||
def provides_file(self, filerequest: tuple): | ||
filename, filespecs = filerequest | ||
if filename in self.content.keys(): | ||
self.update_file(filerequest) | ||
return True | ||
else: | ||
if self.cache_file(): | ||
self.place_new_file(filerequest) | ||
return False | ||
|
||
|
||
def cache_file(self): | ||
|
||
# cache everything, test different implementations | ||
return cache_algorithm["standard"]() | ||
|
||
def describe_state(self): | ||
print( | ||
"{name} on site {site}: {used}MB of {tot}MB used ({div} %), contains " | ||
"files {filelist}".format( | ||
name=self.name, | ||
site=self.sitename, | ||
used=self.usedstorage, | ||
tot=self.storagesize, | ||
div=100.0 * self.usedstorage / self.storagesize, | ||
filelist=", ".join(self.content.keys()), | ||
) | ||
) | ||
|
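The Storage class calls cache_cleanup["fifo"](filesize_difference, self), but that implementation is not part of the diff shown here. The sketch below is therefore an assumption about what a first-in-first-out cleanup could look like, not the code from lapis. It works on a plain content dictionary instead of a Storage object, and the function name fifo_cleanup and the example file entries are made up for illustration.

```python
from typing import Dict, List


def fifo_cleanup(
    required_space: int, content: Dict[str, dict], storagesize: int
) -> List[str]:
    """Evict the longest-cached files until required_space fits.

    Mutates content in place and returns the names of the evicted files.
    """

    def free_space() -> int:
        return storagesize - sum(spec["usedsize"] for spec in content.values())

    evicted = []
    # Files with the smallest cachedsince timestamp entered the cache first.
    for filename in sorted(content, key=lambda name: content[name]["cachedsince"]):
        if free_space() >= required_space:
            break
        del content[filename]
        evicted.append(filename)
    return evicted


# Illustrative cache content (values are invented for this example).
content = {
    "a.root": {"usedsize": 40, "cachedsince": 10},
    "b.root": {"usedsize": 30, "cachedsince": 5},
    "c.root": {"usedsize": 20, "cachedsince": 20},
}
evicted = fifo_cleanup(required_space=50, content=content, storagesize=100)
```

Here required_space is a positive number of megabytes that must be free afterwards, the same convention update_file uses when it passes filesize_difference to make_room.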
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
import csv | ||
from lapis.storage import Storage | ||
|
||
|
||
def storage_reader(storage, storage_content): | ||
storage_content = storage_content_reader(storage_content) | ||
reader = csv.DictReader(storage, delimiter=" ", quotechar="'") | ||
for row in reader: | ||
yield Storage( | ||
name=row["name"], | ||
sitename=row["sitename"], | ||
storagesize=int(row["cachesizeMB"]), | ||
content=storage_content[row["name"]], | ||
) | ||
|
||
|
||
def storage_content_reader(file_name): | ||
reader = csv.DictReader(file_name, delimiter=" ", quotechar="'") | ||
cache_information = dict() | ||
for row in reader: | ||
if row["cachename"] not in cache_information.keys(): | ||
cache_information[row["cachename"]] = dict() | ||
cache_information[row["cachename"]][row["filename"]] = dict() | ||
for key in [ | ||
"filesize", | ||
"usedsize", | ||
"cachedsince", | ||
"lastaccessed", | ||
"numberofaccesses", | ||
]: | ||
cache_information[row["cachename"]][row["filename"]][key] = int(row[key]) | ||
if not cache_information: | ||
cache_information = None | ||
return cache_information |
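The two readers above imply a space-delimited input format with single quotes as the quote character. The sketch below shows files of that shape being parsed with the same csv.DictReader settings. The column names come from the diff; the file contents, the INT_FIELDS constant and the use of plain dictionaries instead of Storage objects are assumptions made for illustration.

```python
import csv
import io

# Made-up example inputs with the column names used by the readers.
storage_file = io.StringIO(
    "name sitename cachesizeMB\n"
    "cache1 site1 1000\n"
)
content_file = io.StringIO(
    "cachename filename filesize usedsize cachedsince lastaccessed numberofaccesses\n"
    "cache1 'file a.root' 200 200 0 0 1\n"
    "cache1 fileb.root 300 150 0 5 2\n"
)

INT_FIELDS = ("filesize", "usedsize", "cachedsince", "lastaccessed", "numberofaccesses")

# Group file records by cache name, converting the numeric columns to int.
contents = {}
for row in csv.DictReader(content_file, delimiter=" ", quotechar="'"):
    per_cache = contents.setdefault(row["cachename"], {})
    per_cache[row["filename"]] = {key: int(row[key]) for key in INT_FIELDS}

storages = [
    {
        "name": row["name"],
        "sitename": row["sitename"],
        "storagesize": int(row["cachesizeMB"]),
        # .get() with a default tolerates caches without any content rows.
        "content": contents.get(row["name"], {}),
    }
    for row in csv.DictReader(storage_file, delimiter=" ", quotechar="'")
]
```

Using setdefault and get avoids the special case in which an empty content file yields None and the later lookup by cache name fails.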
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
def cache_all(): | ||
|
||
return True | ||
|
||
|
||
cache_algorithm = {"standard": cache_all} |
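The cache_algorithm dictionary acts as a small registry that maps a policy name to a callable. As a sketch of how a second policy could be registered and selected, the example below adds a hypothetical "never" entry; cache_nothing and should_cache are assumptions and do not appear in the pull request.

```python
def cache_all() -> bool:
    """Always cache, as in the diff above."""
    return True


def cache_nothing() -> bool:
    """Hypothetical alternative policy: never cache (assumption)."""
    return False


# Registry of caching policies, keyed by name.
cache_algorithm = {"standard": cache_all, "never": cache_nothing}


def should_cache(policy: str = "standard") -> bool:
    """Look up the named policy and ask it for a decision."""
    return cache_algorithm[policy]()
```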
Could you please reason about including `sitename` into the drone? From my point of view, this might be too specific. I currently also don't see the advantages of having something like `sitename`. But I bet you do have some very good reasons! :)

I'd be careful about using `sitename` -- it is a very specific jargon/feature (CMS) and may not properly reflect how we need to model caches. E.g. what we've seen is that for a single "site" there may be different cache/storage requirements -- see Chris' results on benchmarking TSystems. If we need to identify the "site", that is basically the `Pool` of the drone -- which can already be used as a `dict` key etc.

Is there some advantage to using a string identifier?

`sitename` is just a quick workaround for the mapping of pools/drones and storage elements. I want to replace this as soon as I understand what information/means of identification will be available once other changes to the overall system (adding more of the jobs ClassAd structure, ...) are done. Passing this information directly to the drone was the option that needed the least changes to existing code.

@maxfischer2781 What do you mean by
?

A `Pool` instance can be used as a key in a mapping -- e.g. `Dict[Pool, Storage]` should be possible to map from pool (site) to storage and vice versa.

Which of course isn't of much use to you if you want to define this outside of the simulation, e.g. in a JSON/YAML configuration file...
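
The mapping suggested in the last comment can be sketched as follows. Pool and Storage here are hypothetical minimal stand-ins, not the lapis classes; the only property relied on is that ordinary Python objects are hashable by identity and can therefore serve as dictionary keys.

```python
from typing import Dict, List


class Pool:
    """Hypothetical stand-in for a pool; hashable by identity by default."""

    def __init__(self, label: str) -> None:
        self.label = label


class Storage:
    """Hypothetical stand-in for a storage element."""

    def __init__(self, name: str) -> None:
        self.name = name


pool_a, pool_b = Pool("a"), Pool("b")

# Pool instances as keys: no string identifier such as sitename is needed.
storage_of: Dict[Pool, List[Storage]] = {
    pool_a: [Storage("cache1")],
    pool_b: [Storage("cache2"), Storage("cache3")],
}

# Reverse direction: from a storage name back to its pool.
pool_of: Dict[str, Pool] = {
    storage.name: pool
    for pool, storages in storage_of.items()
    for storage in storages
}
```

As the comment notes, this only works inside the simulation; a configuration file would still need some serialisable identifier to refer to a pool.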