
WIP, PoC: Add pyautogui workflow #308

Draft · melissawm wants to merge 2 commits into main
Conversation

melissawm
Member

References and relevant issues

Addresses #289

Description

This is a proof of concept for autogeneration of videos for the docs. There are probably several ways in which this could be improved/changed but I wanted to get some early feedback on the general workflow. Specifically:

  • Is this worth doing? We won't have many such scripts right now, but long term they will probably pay off in terms of detecting GUI changes and reducing maintenance.
  • This script only performs the actions; as far as I can tell, the video needs to be captured by some other tool (pyautogui only takes screenshots, not video). I experimented with OBS, but any other screen recording tool would work. I'd recommend writing docs on the appropriate standard settings so we have consistency across the recorded videos. Note that this workflow also helps with that, since the napari window size, theme and settings would all be the same across videos (see the sketch after this list).
  • I've had to force event processing twice for each action, because otherwise I would only see the events when the next one was triggered. From some quick research, this could be related to async processing, but unfortunately I don't know enough about Qt to debug it. Any help is appreciated!
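For concreteness, here is a minimal sketch of the pattern this describes. The button image path, window size and theme are placeholders rather than the PR's actual code, and the confidence= argument of locateCenterOnScreen requires OpenCV to be installed:

```python
import pyautogui
import napari
from qtpy.QtWidgets import QApplication

# Standardize the window so recorded videos look the same across scripts.
viewer = napari.Viewer()
viewer.theme = "dark"
viewer.window.resize(1280, 720)
QApplication.processEvents()

def click_button(image_path):
    """Find a button on screen by its reference screenshot and click it."""
    location = pyautogui.locateCenterOnScreen(image_path, confidence=0.9)
    pyautogui.click(location)
    # The workaround described above: flush the Qt event queue twice so
    # the effect of this click is rendered before the next action runs.
    QApplication.processEvents()
    QApplication.processEvents()

click_button("images/new-points-layer.png")  # hypothetical asset path
```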

@melissawm melissawm marked this pull request as draft December 19, 2023 13:42
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Dec 19, 2023
Co-authored-by: Daniel Althviz Moré <[email protected]>
@jni
Member

jni commented Dec 20, 2023

Ah! This is failing for me, I think because my screen is scaled 2x and locateCenterOnScreen is not scale-invariant. 😬 After a quick browse of the PyAutoGUI docs I don't see any support for image scaling, so I suspect we might need to use locate directly after rescaling the template images to match the screen scaling (sketched below). We can potentially even be faster by getting a screenshot of the viewer, grabbing the coordinates within that screenshot, then transforming to screen space.
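Something like this, maybe (untested; the helper name and fixed 2x factor are assumptions, and confidence= again needs OpenCV):

```python
import pyautogui
from PIL import Image

def locate_scaled(template_path, scale=2):
    """Locate a template image on a scaled (HiDPI) display."""
    template = Image.open(template_path)
    scaled = template.resize(
        (template.width * scale, template.height * scale), Image.LANCZOS
    )
    # pyautogui.screenshot() captures physical pixels, so the rescaled
    # template can be matched against it directly.
    box = pyautogui.locate(scaled, pyautogui.screenshot(), confidence=0.9)
    if box is None:  # newer PyAutoGUI raises instead of returning None
        return None
    center = pyautogui.center(box)
    # Mouse functions expect logical coordinates, so divide back down.
    return (center.x / scale, center.y / scale)
```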

ANYWAY. This is SO. COOL. It's actually not at all what I was thinking (I thought all the code would be model code and we would forgo "clicks" entirely, just seeing things happen in the viewer and on screen), but in a lot of ways it's better because of the clicks.

Other ideas:

  • Grab the button images from the napari repo instead of saved screenshots. That way, if we change the icons, the scripts still work! 🤯
  • It would be really good if the messages could be integrated with the screen capture method to add captions to the video. Amazing, really. 😃
  • Set the image paths using pathlib.Path(__file__).parent / '../../images/...' so that you can run the script from any directory (see the sketch below).
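For that last point, something along these lines (the images directory layout and file name are assumed for illustration):

```python
from pathlib import Path

# Resolve image paths relative to this script, not the working directory.
IMAGES = Path(__file__).parent / ".." / ".." / "images"

button_image = IMAGES / "new-points-layer.png"  # hypothetical file name
```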

@melissawm
Member Author

Reviving this after a long time. Talking to @villares, he pointed out this approach: https://www.shedloadofcode.com/blog/record-mouse-and-keyboard-for-automation-scripts-with-python

The idea is that instead of writing every interaction manually in these scripts (which honestly would be a HUGE PAIN), we could record the interactions and generate the scripts automatically. I haven't tried this yet, but I will!
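Presumably the recording side would look something like this pynput-based sketch (the stop condition and the generated output format here are my assumptions):

```python
from pynput import mouse

events = []

def on_click(x, y, button, pressed):
    if pressed:
        # Record each click as a line of pyautogui code to replay later.
        events.append(f"pyautogui.click({x}, {y})")
    if button == mouse.Button.middle:
        return False  # a middle-click stops the recording

with mouse.Listener(on_click=on_click) as listener:
    listener.join()

print("\n".join(events))  # paste into a script, or write it to a file
```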
