WALL-E role updates #7
base: main
Conversation
Oh dear, I just noticed that this overlaps a bit with your existing PR @mira-miracoli 😬
Hi, thank you for your contribution! :)
Hey Mira, yes please feel free to merge your PR first and I'll clean up mine. The
@neoformit |
Hey @mira-miracoli I've rebased, there were lots of conflicts that I think I've resolved correctly, but please check the diff on
I think this looks ok now, I'll try updating the role and running our playbook to check for issues.
```python
    while chunk := specimen.read(chunksize):
        sha1.update(chunk)
except PermissionError:
    logger.warning(f"Permission denied for file: {path}")
```
There are two cases where I caught permission errors here; in my case it was only for one file in the JWD (I think command.sh), but this error could be fatal if all JWD files raise PermissionError. I guess in that case it would be pretty obvious in walle.log that WallE is not working.
Yep, in my walle.log I get:

```
2024-10-11 04:48 - WARNING - Permission denied for file: /mnt/galaxy/tmp/job_working_directory/_interactive/24546/command.sh
```
Thank you! I think walle should run as root, because the jupyter users have root access inside their jupyter notebook and can save files as uid and gid 0
WallE could clean up everything non-root, and the rest we could leave to our normal cleanup scripts?
Walle does not clean up, it just scans files. But in order to do so, it needs read access.
Tested on the AU dev server and working ok with
Sorry, some of the comments are just ideas, not worth changing.
I am not sure how `--kill` and `--debug` will be used in the script, I don't see a change in the main function.
(You're probably still working on it?)
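For context, this is the usual way such flags get wired into a script's entry point — a hedged sketch, not the PR's actual `main` (the flag names come from the discussion, everything else is assumed):

```python
import argparse
import logging

def parse_args(argv=None):
    parser = argparse.ArgumentParser(
        description="WALL-E job working directory scanner (sketch)")
    parser.add_argument("--debug", action="store_true",
                        help="enable verbose debug logging")
    parser.add_argument("--kill", action="store_true",
                        help="fail jobs with malicious files via gxadmin")
    return parser.parse_args(argv)

def main(argv=None):
    args = parse_args(argv)
    # --debug simply lowers the log threshold; --kill would gate the
    # gxadmin calls further down in the scan loop
    logging.basicConfig(level=logging.DEBUG if args.debug else logging.INFO)
    return args
```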
defaults/main.yml (Outdated)

```diff
@@ -30,6 +31,13 @@ walle_envs_database:
     value: "{{ galaxy_config_dir }}/galaxy.yml"
```
Suggested change:

```diff
-    value: "{{ galaxy_config_dir }}/galaxy.yml"
+    value: "{{ galaxy_config_file }}"
```
I would also change the following (but GitHub does not allow suggestions on unchanged code parts :/ or I don't know how):
```diff
- - key: PGHOST
-   value: 127.0.0.1
- - key: PGUSER
-   value: galaxy
- - key: PGDATABASE
-   value: galaxy
+ - key: PGHOST
+   value: "{{ galaxy_pg_host }}"
+ - key: PGUSER
+   value: "{{ galaxy_pg_user }}"
+ - key: PGDATABASE
+   value: "{{ galaxy_pg_db }}"
```
`galaxy_pg_host`, `galaxy_pg_user` and `galaxy_pg_db` seem to be EU-specific playbook vars, we don't have them in AU or in the `galaxyproject.galaxy` role? I assumed that admins would change these values with `walle_extra_env_vars` if they wanted to customize them.
oh okay, to me it looked like you added them in the README
Yep, you're right, I did 🤦 I have committed this suggestion: 4ce5886. It does make it easier for admins to control these basic vars, and they can still get more flexibility with `walle_extra_env_vars`.
Thanks a lot for the thorough review @mira-miracoli 🎉 I think we just have to decide how to handle the
Of course, many thanks for the great contribution! :)
I am not happy with the dictionaries either. Ansible does not allow changing only specific k/v pairs from default, so you need to copy everything and then do your changes.
should work for most Galaxy instances(?)
Co-authored-by: Mira <[email protected]>
True, but I think they can just override them by appending to `walle_envs_database`:

```yaml
# Role defaults:
- key: PGHOST
  value: 127.0.0.1
- key: PGUSER
  value: galaxy
```

```yaml
walle_extra_env_vars:  # Defined in playbook
  - key: PGUSER
    value: custom_db_user

walle_env_vars: "{{ walle_envs_database + walle_extra_env_vars }}"

# Results in:
walle_env_vars:
  - key: PGHOST
    value: 127.0.0.1
  - key: PGUSER
    value: galaxy
  - key: PGUSER
    value: custom_db_user
```

If you do
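To illustrate why appending works: when the combined key/value list is eventually rendered into the environment, a later entry for the same key overwrites the earlier one. A Python sketch of that last-one-wins behaviour (not the role's actual templating):

```python
def render_env(pairs):
    """Collapse a list of {key, value} dicts into an env mapping.

    Later entries win, so extra vars appended by the playbook
    override the role defaults for the same key.
    """
    env = {}
    for pair in pairs:
        env[pair["key"]] = pair["value"]
    return env

walle_envs_database = [
    {"key": "PGHOST", "value": "127.0.0.1"},
    {"key": "PGUSER", "value": "galaxy"},
]
walle_extra_env_vars = [
    {"key": "PGUSER", "value": "custom_db_user"},
]

env = render_env(walle_envs_database + walle_extra_env_vars)
# env["PGUSER"] is now "custom_db_user"
```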
Sorry, didn't mean to close this. I did a typing blooper and hit some shortcut by accident 💩
oh smart! I did not think of this, that you could use the
Yeah for sure! Do you think it's good to also have the required playbook vars that I defined in the README?
If they only appear in EU's playbooks, I think we could remove that
Sorry, I know this is very annoying but pyright gave me some errors.
Closed and reopened, so the pyright action triggers (I don't know a more subtle way 🙄)
That is annoying, it seems that pyright raises linting errors for type hints on dependencies (i.e. we can't add type hinting to
You may wish to have a discussion about this, but I made quite a few changes to get WALL-E working on Galaxy AU. I think these changes should improve interoperability between Galaxy servers, but would require a minor update to EU's playbook to work (additional vars).

- `--debug` option for `walle.py`
- Dedicated `walle_bashrc` for `walle.py`
- `--kill` option for `walle.py` to kill malicious jobs with gxadmin
- `galaxy_jwd.py` that accepts XML or YAML format for `object_store_conf`
### Add debug logging

`walle.py --debug` was really useful for getting this going. It's very verbose so you probably don't want to leave it on in production.

### Modify cron job

I found that cron is not able to source our Galaxy `.bashrc` file, probably because it contains code that is specific to an interactive shell. The result is that none of the env vars that are set in `walle_bashrc` make it into WALL-E's env. I fixed this by creating a new `.bashrc` file for WALL-E (with all required env variables) and sourcing it in crontab like so:
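The crontab snippet itself didn't survive the page scrape; the pattern described is roughly this (the schedule and paths here are assumptions, not the role's actual values):

```shell
# Source WALL-E's dedicated env file, then run the scan (sketch)
0 * * * * . /home/galaxy/.bashrc-walle && python /opt/walle/walle.py >> /var/log/walle.log 2>&1
```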
### Add all required env variables

To enable the dedicated `walle_bashrc`, all required env vars are added by the walle role. This requires additional Ansible variables, which are documented in README.md. Additional env vars can be easily added in the playbook with `walle_extra_env_vars`.

### Permission error in `walle.py`
Instead of failing, log a warning if `PermissionError` is raised when reading a JWD file.

### Add `--kill` option for `walle.py`
Optional, of course. We won't use this yet but perhaps in future when we're confident in WALL-E's abilities. We would also like to add an `--alert` option to notify us in Slack when malicious files are detected - this will most likely be our default. The kill option assumes that gxadmin is accessible (can point to it with env var `GXADMIN_PATH`), which will be run like:

```shell
gxadmin mutate fail-job $JOB_ID --commit
gxadmin mutate fail-terminal-datasets --commit
```
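A sketch of how a `--kill` option might drive those gxadmin commands from Python, honouring `GXADMIN_PATH` (the function names here are hypothetical, not WallE's actual code):

```python
import os
import subprocess

def build_kill_commands(job_id):
    """Assemble the gxadmin invocations for failing a malicious job."""
    # Fall back to gxadmin on PATH if GXADMIN_PATH is not set
    gxadmin = os.environ.get("GXADMIN_PATH", "gxadmin")
    return [
        [gxadmin, "mutate", "fail-job", str(job_id), "--commit"],
        [gxadmin, "mutate", "fail-terminal-datasets", "--commit"],
    ]

def kill_job(job_id):
    for cmd in build_kill_commands(job_id):
        subprocess.run(cmd, check=True)
```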
### Ignore Ansible failure of "Clone malware database (WallE)"
Useful if you want to make a local modification of `checksums.yml` for testing.