[FEATURE] allow for all kwargs when using @log_step #552
Comments
I took a brief look, but the only thing I could imagine failing is the |
yes |
This is what I've currently implemented:
if shape_delta:
    if len(args) == 0 and len(kwargs) > 0:
        _kwargs = {**kwargs}
        first_arg = next(iter(_kwargs))
        if isinstance(_kwargs[first_arg], pd.DataFrame):
            # promote the leading DataFrame kwarg to a positional arg
            args = args + (_kwargs.pop(first_arg),)
    old_shape = args[0].shape
tic = dt.datetime.now()
|
If you think this is a good direction I'll be happy to submit a PR |
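For readers following along, the kwarg-promotion idea above can be sketched as a small self-contained decorator. This is an illustrative sketch, not the library's code: `Frame` is a hypothetical stand-in for pd.DataFrame (so the example runs without pandas), and `log_shape` and `shape_log` are made-up names for a stripped-down version of @log_step that only records shapes.

```python
import functools


class Frame:
    """Hypothetical stand-in for pd.DataFrame; only .shape is needed here."""

    def __init__(self, rows, cols):
        self.shape = (rows, cols)


shape_log = []  # (function name, input shape, output shape)


def log_shape(func):
    """Simplified, hypothetical stand-in for log_step with shape tracking only."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if not args and kwargs:
            first_name = next(iter(kwargs))
            if isinstance(kwargs[first_name], Frame):
                # `kwargs` is a fresh dict built for this call, so popping is
                # safe and avoids passing the frame twice to `func`.
                args = (kwargs.pop(first_name),)
        old_shape = args[0].shape
        result = func(*args, **kwargs)
        shape_log.append((func.__name__, old_shape, result.shape))
        return result

    return wrapper


@log_shape
def drop_rows(df, n=1):
    return Frame(df.shape[0] - n, df.shape[1])


out = drop_rows(df=Frame(10, 3), n=4)  # all-kwargs call now works
```

Note the limits of this approach: it only helps when the frame is the first keyword in the call and also the first parameter of the function. A call such as drop_rows(n=4, df=...) would still fail on args[0].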
I'm not sure if changing Maybe something like this?
|
@MBrouns should I provide a PR? |
It may be overkill, but one way to do it is to let the user specify the index or name of the argument that holds the dataframe to track, and then use inspect.signature to look it up. Example, assuming the new argument is called arg_to_track:
if shape_delta:
    func_args = (
        inspect.signature(func).bind(*args, **kwargs).arguments  # type: ignore
    )
    if isinstance(arg_to_track, int) and arg_to_track >= 0:
        # positional lookup, in the order of the function's parameters
        _, input_frame = tuple(func_args.items())[arg_to_track]
    elif isinstance(arg_to_track, str):
        # lookup by parameter name
        input_frame = func_args[arg_to_track]
    else:
        raise ValueError("arg_to_track should be a string or a non-negative integer")
    old_shape = input_frame.shape
...
result = func(*args, **kwargs)
...
Note that both getters could result in errors if
Edit: commenting and expanding on this approach (*). The snippet
for v in kwargs.values():
    if isinstance(v, pd.DataFrame):
        old_shape = v.shape
        break
assumes that the first dataframe passed is the one we want to track, but that could be badly wrong if a function takes more than one dataframe as input (e.g. if one wants to do a merge inside it). |
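The arg_to_track idea from this comment can be sketched end to end as follows. This is a hedged illustration under stated assumptions: `Frame` is a hypothetical stand-in for pd.DataFrame, and `log_shape` and `shape_log` are made-up names, not part of the library. The only real API relied on is inspect.signature(...).bind(...), whose `arguments` mapping is ordered by the function's parameters rather than by the order used in the call.

```python
import functools
import inspect


class Frame:
    """Hypothetical stand-in for pd.DataFrame; only .shape is needed here."""

    def __init__(self, rows, cols):
        self.shape = (rows, cols)


shape_log = []  # (function name, tracked input shape, output shape)


def log_shape(arg_to_track=0):
    """Hypothetical decorator factory; tracks one argument by index or name."""
    is_index = isinstance(arg_to_track, int) and arg_to_track >= 0
    if not (is_index or isinstance(arg_to_track, str)):
        raise ValueError("arg_to_track should be a string or a non-negative integer")

    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Map every passed value to its parameter name, in parameter order.
            bound = signature.bind(*args, **kwargs).arguments
            if is_index:
                tracked = list(bound.values())[arg_to_track]
            else:
                tracked = bound[arg_to_track]
            old_shape = tracked.shape
            result = func(*args, **kwargs)
            shape_log.append((func.__name__, old_shape, result.shape))
            return result

        return wrapper

    return decorator


@log_shape(arg_to_track="right")
def merge(left, right):
    rows = min(left.shape[0], right.shape[0])
    return Frame(rows, left.shape[1] + right.shape[1])


@log_shape(arg_to_track=0)
def head(df, n):
    return Frame(n, df.shape[1])


merged = merge(right=Frame(5, 2), left=Frame(8, 3))  # tracks `right` explicitly
top = head(n=2, df=Frame(10, 4))  # index 0 is `df`, whatever the keyword order
```

Because bind() does not include parameters left at their default values unless apply_defaults() is called, looking up a tracked argument that the caller omitted would raise a KeyError or IndexError in this sketch.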
Hi,
When using @log_step to debug a pandas pipeline, the decorated function currently must accept a single argument, df: pd.DataFrame. However, if the user passes all the parameters as kwargs, an error is raised.
It would be useful if @log_step checked the first kwarg and, if it is a pd.DataFrame, converted it into an arg. A possible implementation would do this conversion before running def wrapper().
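The failure described here can be reproduced with a minimal sketch, assuming the decorator reads the input shape from args[0] as the snippets in this thread do. `Frame` and `log_step_sketch` are hypothetical stand-ins so the example runs without pandas or the library itself.

```python
import functools


class Frame:
    """Hypothetical stand-in for pd.DataFrame; only .shape is needed here."""

    def __init__(self, rows, cols):
        self.shape = (rows, cols)


def log_step_sketch(func):
    """Simplified stand-in that reads the input shape from args[0]."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        old_shape = args[0].shape  # assumes the frame was passed positionally
        result = func(*args, **kwargs)
        print(f"{func.__name__}: {old_shape} -> {result.shape}")
        return result

    return wrapper


@log_step_sketch
def add_column(df):
    return Frame(df.shape[0], df.shape[1] + 1)


ok = add_column(Frame(4, 2))  # positional call works

error = None
try:
    add_column(df=Frame(4, 2))  # keyword-only call: `args` is empty
except IndexError as exc:
    error = exc
```

With an all-kwargs call, `args` is an empty tuple, so indexing it fails before the wrapped function is ever invoked.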