Discovery of valid parameters for OO-syntax #117

Open
FrithiofJensen opened this issue Feb 26, 2020 · 0 comments
FrithiofJensen commented Feb 26, 2020

Hi!

When configuring a Ruffus pipeline using the OO syntax, one can run into problems passing parameters to Ruffus Task objects.

Certain types of tasks, such as 'Split()', will refuse a 'pipeline_dir=' parameter that other tasks, such as 'Transform()', happily accept. (Is this a bug or deliberate?)

One way to discover which parameters each Task variant accepts is to use the 'inspect' module.

In a Python session, one can look at the source of a Task's _prepare_<task-type> method, like so:

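import inspect
import ruffus

# Dummy task function: it only prints the arguments it receives.
def tf(*args, **kwargs):
    print(args, kwargs)

pl = ruffus.Pipeline(name='testing')
task = pl.split(task_func=tf, name='atask', output='stuff')


# Print the source of the method that sets up a split task (output follows).
print(inspect.getsource(task._prepare_split))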
    def _prepare_split(self, unnamed_args, named_args):
        """
        Common code for @split and pipeline.split
        """
        self.error_type = ruffus_exceptions.error_task_split
        self._set_action_type(Task._action_task_split)
        self._setup_task_func = Task._split_setup
        self.needs_update_func = self.needs_update_func or needs_update_check_modify_time
        self.job_wrapper = job_wrapper_io_files
        self.job_descriptor = io_files_one_to_many_job_descriptor
        self.single_multi_io = self._one_to_many
        # output is a glob
        self.indeterminate_output = 1

        #
        #   Parse named and unnamed arguments
        #
        self.parsed_args = parse_task_arguments(unnamed_args, named_args,
                                                ["input", "output", "extras"],
                                                self.description_with_args_placeholder)

# Same again for the transform set-up method (output follows).
print(inspect.getsource(task._prepare_transform))
    def _prepare_transform(self, unnamed_args, named_args):
        """
        Common function for pipeline.transform and @transform
        """
        self.error_type = ruffus_exceptions.error_task_transform
        self._set_action_type(Task._action_task_transform)
        self._setup_task_func = Task._transform_setup
        self.needs_update_func = self.needs_update_func or needs_update_check_modify_time
        self.job_wrapper = job_wrapper_io_files
        self.job_descriptor = io_files_job_descriptor
        self.single_multi_io = self._many_to_many

        #   Parse named and unnamed arguments
        self.parsed_args = parse_task_arguments(unnamed_args, named_args,
                                                ["input", "filter", "modify_inputs",
                                                 "output", "extras", "output_dir"],
                                                self.description_with_args_placeholder)


The pattern seems to be that a list of permitted parameters is passed to 'parse_task_arguments'; some of these may be optional, others required.
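Reading the listings by eye works, but the permitted list can also be pulled out programmatically. The following is only a sketch, assuming the list is always written as a literal in the third positional argument of 'parse_task_arguments', as in the two listings above. The helper name 'permitted_parameters' is made up for this example, and the source strings are abbreviated stand-ins so the snippet runs without Ruffus; in a real session they would come from inspect.getsource().

```python
import ast
import textwrap


def permitted_parameters(source, parser_name="parse_task_arguments"):
    """Return the string literals given as the third positional argument
    of the first call to `parser_name` in `source`, or None if not found.

    Hypothetical helper: it assumes the permitted list is a list literal.
    """
    # inspect.getsource() returns methods with their class indentation,
    # so dedent before parsing.
    tree = ast.parse(textwrap.dedent(source))
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == parser_name
                and len(node.args) >= 3
                and isinstance(node.args[2], ast.List)):
            return [ast.literal_eval(element) for element in node.args[2].elts]
    return None


# Stand-ins abbreviated from the listings above. With Ruffus installed,
# use inspect.getsource(task._prepare_split) and so on instead.
SPLIT_SOURCE = '''
    def _prepare_split(self, unnamed_args, named_args):
        self.parsed_args = parse_task_arguments(unnamed_args, named_args,
                                                ["input", "output", "extras"],
                                                self.description_with_args_placeholder)
'''

TRANSFORM_SOURCE = '''
    def _prepare_transform(self, unnamed_args, named_args):
        self.parsed_args = parse_task_arguments(unnamed_args, named_args,
                                                ["input", "filter", "modify_inputs",
                                                 "output", "extras", "output_dir"],
                                                self.description_with_args_placeholder)
'''

split_params = permitted_parameters(SPLIT_SOURCE)
transform_params = permitted_parameters(TRANSFORM_SOURCE)

print(split_params)
print(transform_params)
# Parameters that transform accepts but split does not.
print(sorted(set(transform_params) - set(split_params)))
```

Comparing the two lists this way makes the difference between task types explicit, which is handy when a keyword works for one task type and is refused by another.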

Some parameters are not explicitly mentioned here but are always passed, like 'name' and 'task_func'.

Anyway, I hope this helps someone a little!
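Parameters handled outside the permitted list might show up in the signature of the method that creates the task. As a rough sketch, inspect.signature() can list what a callable declares explicitly, as opposed to what it swallows through *args and **kwargs. Note that 'create_task' below is a hypothetical stand-in written for this example; it is not the real Ruffus signature, which may declare these parameters differently.

```python
import inspect


def create_task(task_func, *unnamed_args, name=None, **named_args):
    """Hypothetical stand-in for a task-creating method (not Ruffus code)."""
    return task_func, name, unnamed_args, named_args


def explicit_parameters(func):
    """Names of parameters declared explicitly, skipping *args and **kwargs."""
    catch_all = (inspect.Parameter.VAR_POSITIONAL,
                 inspect.Parameter.VAR_KEYWORD)
    return [param_name
            for param_name, param in inspect.signature(func).parameters.items()
            if param.kind not in catch_all]


print(explicit_parameters(create_task))
```

In a real session one could pass a bound method such as pl.split instead of the stand-in and see which names, if any, are declared outside the catch-all arguments.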
