Dispatcher batchSize support #73

Open
Rapheus opened this issue Mar 19, 2024 · 3 comments

Rapheus commented Mar 19, 2024

Hello there,

Here's my issue and some context

I have a renderfarm with only a couple of workers.
When I dispatch a script to Deadline, I want a single node to render all the frames at once, to save on resources (opening and closing Gaffer). I use batchSize to do so, like so:
[Screenshot: python_mav0WAV3D1]

When I execute using Gaffer's LocalDispatcher, everything works as expected: all frames are written within a single process.

However, if I dispatch using GafferDeadline's dispatcher, the task completes too quickly because the process exits early with code 0, and only 41 of the 100 frames are written to the network drive.
[Screenshot: mstsc_Adyzk1QWE2]

I'm using Gaffer 1.3.1.0 and Deadline 10.3.
I would like to know whether the GafferDeadline extension and/or Deadline support Gaffer's batchSize option.

Thanks a lot for your help

ericmehl (Member) commented

Hi @Rapheus,
GafferDeadline does support the batchSize option, and it looks like it's getting set correctly on your Deadline job. If it weren't, you would see 100 tasks for your job, each with a single frame, instead of the single task rendering frames 1-100 that you have.
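
To illustrate the task counts described above, here is a minimal sketch of how a frame range splits into batches. It assumes batches are simply consecutive chunks of at most `batchSize` frames; `batch_frames` is a hypothetical helper written for illustration and is not part of Gaffer or GafferDeadline.

```python
def batch_frames(frames, batch_size):
    """Split frames into consecutive chunks of at most batch_size frames.

    Hypothetical helper for illustration only; it is not a Gaffer API.
    """
    frames = list(frames)
    return [frames[i:i + batch_size] for i in range(0, len(frames), batch_size)]


frames = range(1, 101)  # frames 1-100

# A batch size of 1 gives one task per frame.
print(len(batch_frames(frames, 1)))    # 100

# A batch size of 100 gives a single task covering every frame.
print(len(batch_frames(frames, 100)))  # 1
```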

It's odd that it renders some frames but not all of them. I tested this setup here with the following script, and it worked as expected:

import Gaffer
import GafferDispatch
import GafferImage
import imath

Gaffer.Metadata.registerValue( parent, "serialiser:milestoneVersion", 1, persistent=False )
Gaffer.Metadata.registerValue( parent, "serialiser:majorVersion", 3, persistent=False )
Gaffer.Metadata.registerValue( parent, "serialiser:minorVersion", 1, persistent=False )
Gaffer.Metadata.registerValue( parent, "serialiser:patchVersion", 0, persistent=False )

__children = {}

__children["ImageWriter"] = GafferImage.ImageWriter( "ImageWriter" )
parent.addChild( __children["ImageWriter"] )
__children["ImageWriter"].addChild( Gaffer.V2fPlug( "__uiPosition", defaultValue = imath.V2f( 0, 0 ), flags = Gaffer.Plug.Flags.Default | Gaffer.Plug.Flags.Dynamic, ) )
__children["Checkerboard"] = GafferImage.Checkerboard( "Checkerboard" )
parent.addChild( __children["Checkerboard"] )
__children["Checkerboard"].addChild( Gaffer.V2fPlug( "__uiPosition", defaultValue = imath.V2f( 0, 0 ), flags = Gaffer.Plug.Flags.Default | Gaffer.Plug.Flags.Dynamic, ) )
__children["ImageWriter"]["dispatcher"]["batchSize"].setValue( 100 )
__children["ImageWriter"]["dispatcher"]["deadline"]["pool"].setValue( 'none' )
__children["ImageWriter"]["dispatcher"]["deadline"]["secondaryPool"].setValue( 'none' )
__children["ImageWriter"]["dispatcher"]["deadline"]["group"].setValue( 'workstation-fast' )
__children["ImageWriter"]["dispatcher"]["deadline"]["onJobComplete"].setValue( 'Nothing' )
__children["ImageWriter"]["dispatcher"]["deadline"]["dependencyMode"].setValue( 'Auto' )
__children["ImageWriter"]["in"].setInput( __children["Checkerboard"]["out"] )
__children["ImageWriter"]["fileName"].setValue( '${HOME}/Desktop/checker_####.exr' )
__children["ImageWriter"]["__uiPosition"].setValue( imath.V2f( 1.0999999, 2.39999986 ) )
__children["Checkerboard"]["size"]["y"].setInput( __children["Checkerboard"]["size"]["x"] )
__children["Checkerboard"]["__uiPosition"].setValue( imath.V2f( 2.5999999, 10.5640621 ) )


del __children

Can you try pasting that into a new Gaffer script and see how it works? If that's different from the script you are testing, can you share your Gaffer script?

It looks like you are launching your own gaffer.bat wrapper. What is that batch script doing? Can you try pointing Deadline's Gaffer plugin directly at the bin/gaffer.cmd launcher?

Is there a chance that you're running out of disk space, that the network or the network drive isn't stable, or that some other limitation is preventing all the frames from being saved? I would expect an error in that case, but perhaps something is not triggering an error when it should.
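
One way to see exactly which frames did not reach the drive is to compare the expected frame range against the files on disk. The sketch below is only an illustration: `missing_frames` is a hypothetical helper, and the default pattern assumes four-digit padding to match the `checker_####.exr` file name used in the test script above.

```python
import os


def missing_frames(directory, first, last, pattern="checker_{frame:04d}.exr"):
    """Return the frame numbers in [first, last] that have no file in directory.

    Hypothetical helper; the file name pattern is an assumption and should be
    adjusted to match the ImageWriter's fileName plug.
    """
    return [
        frame
        for frame in range(first, last + 1)
        if not os.path.isfile(os.path.join(directory, pattern.format(frame=frame)))
    ]
```

For example, `missing_frames(os.path.expanduser("~/Desktop"), 1, 100)` would list every frame from 1 to 100 that has no matching file. A contiguous run of missing frames at the end of the range would suggest the process stopped early rather than failing on individual frames.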

It might also be worth trying a newer Gaffer version. There have been a number of changes since 1.3.1.0, and I generally test GafferDeadline against the latest Gaffer version.

Rapheus commented Mar 20, 2024

Thanks for your answer,

I will try your suggestions and report back shortly

Rapheus commented Mar 22, 2024

Hey!

Running the original gaffer.cmd file on the farm makes the job work as expected.

The gaffer.bat file that I was using is part of a custom launcher that runs apps through a python process (using subprocess.Popen).
It turns out I wasn't capturing the stdout of the process correctly, so the launcher would return an early exit code 0.
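
For anyone hitting the same symptom with a wrapper of their own, here is a minimal sketch of a launcher that waits for the child process and passes its exit code on. It is not the actual launcher from this thread: `run_and_wait` is a hypothetical name, and the sketch assumes the wrapper should stream the child's output as it arrives and then return the child's real exit code.

```python
import subprocess
import sys


def run_and_wait(command):
    """Run command, forward its output line by line, and return its exit code.

    Hypothetical helper for illustration. subprocess.Popen returns as soon as
    the child has started, so the wrapper has to read the output to the end
    and call wait() before reporting an exit code.
    """
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout so nothing is lost
        text=True,
    )
    # Iterating over the pipe blocks until the child closes it.
    for line in process.stdout:
        sys.stdout.write(line)
        sys.stdout.flush()
    # wait() blocks until the child has exited and returns its exit code.
    return process.wait()
```

A wrapper script would then finish with something like `sys.exit(run_and_wait(command))`, so that the calling process sees the child's exit code only after the child has actually finished.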

Thanks a lot for your help,
And thanks for your work on the plugin
