-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flow
+ Job
magic methods
#369
Conversation
This is great! Thanks very much. I notice from the tests that I assumed that job comparison would depend not only on the UUIDs but also on the specific function arguments. I think this is just relying on the default dataclass Personally, I think having stricter equality is better approach but I am open to other thoughts. |
This is indeed nice. Just to comment or push the discussion a bit further, what "should" be the len of a Flow if it is made of other Flows ? Could be the number of "steps" (jobs or flows) or the number of "flattened" jobs. I think it should be made clear what it is. Same for the print, what happens when you'd print a Flow with Flows in it. I guess it would/should print kind of an indented "structure" of the Flow with its subflows. Maybe it would be nice to have a choice on how "deep" you want to print. Additional thought on the dynamical aspect. The Flow as such only exist at creation time while it will disappear once inserted into the database. A some point it would be nice to be able to somehow have the information of this Flow stored somewhere. And the len of this Flow can evolve in that case (if you have additions, replace, ...). I'm maybe mixing up different issues here but they are somehow related to each other. |
I think those are very good points @davidwaroquiers. To add to that, some comments:
Maybe its ok to leave the implementation as is, but I agree that the behaviour should be clarified. It would be nice to add examples of how to use these methods in the docs somewhere. |
Great comments, thanks @utf @davidwaroquiers.
I considered returning the flattened length of a @property
def flat(self):
... which would recursively unpack subflows. Then people can use
@davidwaroquiers That's already handled actually. If from jobflow import Flow, Job
def add(a, b):
return a + b
def get_test_job():
return Job(add, function_args=(1, 2))
subflow1 = Flow([get_test_job()])
subflow2 = Flow([get_test_job(), get_test_job()])
subsubflow = Flow([get_test_job(), get_test_job()])
subflow3 = Flow([get_test_job(), get_test_job(), subsubflow])
print(Flow([subflow1, subflow2, subflow3])) Flow(name='Flow', uuid='f7ae544a-7e8d-4976-b768-13e431199fa6') consisting of 3 jobs:
1. Flow(name='Flow', uuid='9e7b6459-adca-48b1-8c00-b86e15876a38') consisting of 1 jobs:
1.1. Job(name='add', uuid='0d4a4621-3249-4f94-bdb1-a83910982921')
2. Flow(name='Flow', uuid='d2b1cdbe-0075-4bba-8686-3851fe81a4b0') consisting of 2 jobs:
2.1. Job(name='add', uuid='bf687f6b-3e5b-4e48-b579-e1646b0707c7')
2.2. Job(name='add', uuid='9422bbf9-8c91-4153-a19b-5aad36d96431')
3. Flow(name='Flow', uuid='b62ebaaf-bea2-489f-8e88-75a2eebfa7a9') consisting of 3 jobs:
3.1. Job(name='add', uuid='5fae9b50-8051-438e-b295-14298569c527')
3.2. Job(name='add', uuid='1dbba533-4cc7-4cef-be76-d9d773cfaf80')
3.3. Flow(name='Flow', uuid='b6526c18-ce3c-48c2-ae26-6207859ccb46') consisting of 2 jobs:
3.3.1. Job(name='add', uuid='c0c2eb67-65ab-4bcc-a4cc-465dd8898f2c')
3.3.2. Job(name='add', uuid='42e8f5d2-42d1-4885-a627-13232362447d') I'll have to get back to the other comments later. |
Looking at it now, I guess we can drop the |
Just had time to take another look at this.
|
class Job(MSONable): | |
""" |
iterflow
I actually think it's a plus to have two ways of iterating through a Flow
. As you said @utf, we just need to be clear in the docs on how for job in flow.iterflow
and for job in flow
differ. But I can see use cases for either one. pandas
also has many different ways of iterating though a dataframe (df.iterrows
, df.items
, df.itertuples
, for col in df
, df.keys
and maybe more).
__repr__
@davidwaroquiers The problem with __repr__
kwargs like depth: int
is people usually aren't aware of them. They just print the object or return it from a Jupyter cell in which case Python/ipykernel
implicitly calls __repr__
on that object without passing any arguments. So I don't think there would be much value in a depth
keyword. To use it would mean
print(flow.__repr__(depth=2)
Happy to take further suggestions esp. about __eq__
but for now I'll start adding tests and docs for the implementation as is.
still unfinished
Shallow copy doesn't make sense; jobs aren't allowed to belong to multiple flows
aadb604
to
1cf1b51
Compare
On the So maybe for |
Oh bad timing. I just changed it to jobflow/tests/core/test_job.py Line 430 in 7a19168
|
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #369 +/- ##
=======================================
Coverage 99.86% 99.86%
=======================================
Files 19 19
Lines 1439 1505 +66
Branches 364 374 +10
=======================================
+ Hits 1437 1503 +66
Misses 1 1
Partials 1 1
|
@utf Other than the |
Thanks @janosh. I'm ok with the current I don't mean to be a stickler, but is it possible to add a few more tests so that the test coverage doesn't downgrade. Part of the issue is that you have changed: if typing.TYPE_CHECKING to if TYPE_CHECKING in some cases, and that means the codecov is moaning about not checking those lines. The simple fix is to add the latter to the coverage exception list here: Lines 111 to 115 in 5602cfd
But there are a couple of |
On it! |
@utf Coverage is back up to 100%. 🎉 |
Brilliant, thank you! |
Closes #368.