Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Flags --remove_keys_from_pickles #475

Merged
merged 2 commits into from
Jan 20, 2025
Merged

Conversation

Qrouger
Copy link
Contributor

@Qrouger Qrouger commented Jan 8, 2025

User can choose if some keys of pickle file are delete or not

@jkosinski jkosinski requested a review from DimaMolod January 8, 2025 12:13
@jkosinski jkosinski added the enhancement New feature or request label Jan 8, 2025
@jkosinski
Copy link
Collaborator

Hi Quentin, thanks for your contribution! It may take some time before we review your pull request because @DimaMolod is away due to unforeseen events.

@DimaMolod
Copy link
Collaborator

DimaMolod commented Jan 16, 2025

Hi @Qrouger and thank you for your contributions to AlphaPulldown! Regarding the --remove_keys_from_pickles flag, I think it is already available and can be used with run_multimer_jobs.py because it is defined in run_structure_prediction.py
https://github.com/KosinskiLab/AlphaPulldown/blob/main/alphapulldown/scripts/run_structure_prediction.py#L144-L145
and all flags from run_structure_prediction.py are imported into run_multimer_jobs.py upon initialisation:
https://github.com/KosinskiLab/AlphaPulldown/blob/main/alphapulldown/scripts/run_multimer_jobs.py#L14
Could you give it a try as is and let me know if it works for you?
Many thanks!

@Qrouger
Copy link
Contributor Author

Qrouger commented Jan 16, 2025

Hi, I tried to use --remove_keys_from_pickles=False on run_multimer_jobs.py, but ['aligned_confidence_probs', 'distogram', 'masked_msa'] from pkl file are already remove !

@DimaMolod
Copy link
Collaborator

Hi, I tried to use --remove_keys_from_pickles=False on run_multimer_jobs.py, but ['aligned_confidence_probs', 'distogram', 'masked_msa'] from pkl file are already remove !

Hi, yes, I think I know where from the problem originates. I think the reason is the very counter-intuitive way of dealing with boolean flags in absl library.
If you pass a flag like --remove_keys_from_pickles=False it will still treat it as True. To make it false you have to use --no<your_flag>, in our case --noremove_keys_from_pickles
citing https://abseil.io/docs/python/guides/flags

DEFINE_bool or DEFINE_boolean: typically does not take an argument: pass --myflag to set FLAGS.myflag to True, or --nomyflag to set FLAGS.myflag to False. --myflag=true and --myflag=false are also supported, but not recommended.

@Qrouger
Copy link
Contributor Author

Qrouger commented Jan 16, 2025

--noremove_keys_from_pickles don't work too

@DimaMolod
Copy link
Collaborator

--noremove_keys_from_pickles don't work too

Thanks for reporting this. I indeed didn't test before that the keys are present after post_prediction_process() if --remove_keys_from_pickles=false. I added these asserts now here:
https://github.com/KosinskiLab/AlphaPulldown/blob/main/test/test_post_prediction.py#L77-L78
still, it seems to work fine for me.
Could you please specify the AP version/pkl files/all other data that you used so that I could reproduce the error?

@Qrouger
Copy link
Contributor Author

Qrouger commented Jan 17, 2025

I use AP 2.0.1, with this command run_multimer_jobs.py --mode=custom --num_cycle=3 --num_predictions_per_model=1 --compress_result_pickles=True --output_path=./result_all_vs_all --data_dir=/data/alphadata_v2 --protein_lists=all_vs_all.txt --monomer_objects_dir=/data/pickle_feature --remove_keys_from_pickles=false, for this interaction: P09783;P0A3W6
test_pickle_keys.zip

I0117 14:48:56.840612 134591756057472 post_modelling.py:65] Identified best pickle file: result_model_1_multimer_v3_pred_0.pkl
I0117 14:48:56.840648 134591756057472 post_modelling.py:68] Removing specified keys from all pickle files in ./result_all_vs_all/P09783_and_P0A3W6
I0117 14:48:56.840716 134591756057472 post_modelling.py:33] Removing keys ['aligned_confidence_probs', 'distogram', 'masked_msa'] from pickle file: ./result_all_vs_all/P09783_and_P0A3W6/result_model_1_multimer_v3_pred_0.pkl
I0117 14:48:56.928484 134591756057472 post_modelling.py:41] Removing key: aligned_confidence_probs
I0117 14:48:56.933539 134591756057472 post_modelling.py:41] Removing key: distogram
I0117 14:48:56.939410 134591756057472 post_modelling.py:41] Removing key: masked_msa
I0117 14:48:56.942368 134591756057472 post_modelling.py:48] Keys removed and file updated: ./result_all_vs_all/P09783_and_P0A3W6/result_model_1_multimer_v3_pred_0.pkl
I0117 14:48:56.942414 134591756057472 post_modelling.py:33] Removing keys ['aligned_confidence_probs', 'distogram', 'masked_msa'] from pickle file: ./result_all_vs_all/P09783_and_P0A3W6/result_model_2_multimer_v3_pred_0.pkl
I0117 14:48:57.029560 134591756057472 post_modelling.py:41] Removing key: aligned_confidence_probs
I0117 14:48:57.034631 134591756057472 post_modelling.py:41] Removing key: distogram
I0117 14:48:57.040381 134591756057472 post_modelling.py:41] Removing key: masked_msa
I0117 14:48:57.043423 134591756057472 post_modelling.py:48] Keys removed and file updated: ./result_all_vs_all/P09783_and_P0A3W6/result_model_2_multimer_v3_pred_0.pkl
I0117 14:48:57.043466 134591756057472 post_modelling.py:33] Removing keys ['aligned_confidence_probs', 'distogram', 'masked_msa'] from pickle file: ./result_all_vs_all/P09783_and_P0A3W6/result_model_3_multimer_v3_pred_0.pkl
I0117 14:48:57.130895 134591756057472 post_modelling.py:41] Removing key: aligned_confidence_probs
I0117 14:48:57.136015 134591756057472 post_modelling.py:41] Removing key: distogram
I0117 14:48:57.141834 134591756057472 post_modelling.py:41] Removing key: masked_msa
I0117 14:48:57.144948 134591756057472 post_modelling.py:48] Keys removed and file updated: ./result_all_vs_all/P09783_and_P0A3W6/result_model_3_multimer_v3_pred_0.pkl
I0117 14:48:57.144990 134591756057472 post_modelling.py:33] Removing keys ['aligned_confidence_probs', 'distogram', 'masked_msa'] from pickle file: ./result_all_vs_all/P09783_and_P0A3W6/result_model_4_multimer_v3_pred_0.pkl
I0117 14:48:57.231919 134591756057472 post_modelling.py:41] Removing key: aligned_confidence_probs
I0117 14:48:57.237420 134591756057472 post_modelling.py:41] Removing key: distogram
I0117 14:48:57.243092 134591756057472 post_modelling.py:41] Removing key: masked_msa
I0117 14:48:57.246319 134591756057472 post_modelling.py:48] Keys removed and file updated: ./result_all_vs_all/P09783_and_P0A3W6/result_model_4_multimer_v3_pred_0.pkl
I0117 14:48:57.246363 134591756057472 post_modelling.py:33] Removing keys ['aligned_confidence_probs', 'distogram', 'masked_msa'] from pickle file: ./result_all_vs_all/P09783_and_P0A3W6/result_model_5_multimer_v3_pred_0.pkl
I0117 14:48:57.332417 134591756057472 post_modelling.py:41] Removing key: aligned_confidence_probs
I0117 14:48:57.337394 134591756057472 post_modelling.py:41] Removing key: distogram
I0117 14:48:57.342532 134591756057472 post_modelling.py:41] Removing key: masked_msa
I0117 14:48:57.345549 134591756057472 post_modelling.py:48] Keys removed and file updated: ./result_all_vs_all/P09783_and_P0A3W6/result_model_5_multimer_v3_pred_0.pkl

@DimaMolod
Copy link
Collaborator

Hi @Qrouger and thanks again for spotting this!
I forgot to explicitly add the --remove_keys_from_pickles key to the list of arguments for the run_structure_prediction command, therefore is was always set to the default value (True). It should be fixed now with the latest commit.
Could you please update AP and confirm it works for you?

@DimaMolod
Copy link
Collaborator

Ok, I've just realised that my fix is identical to yours that you suggested 2 weeks ago :-) I'll merge the PR right away and thanks again for your contribution

@DimaMolod DimaMolod merged commit 6c43bac into KosinskiLab:main Jan 20, 2025
4 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants