
Add parameter to config and gui to set the min precursors required for update #460

Open
wants to merge 26 commits into main

Conversation

anna-charlotte (Contributor)

Adding the two_step_classifier_min_precursors_for_update parameter to config and gui (experimental)

@anna-charlotte anna-charlotte changed the base branch from main to fix-update-of-first-classifier January 29, 2025 14:56
@anna-charlotte anna-charlotte marked this pull request as draft January 29, 2025 14:58
Base automatically changed from fix-update-of-first-classifier to main January 30, 2025 21:45
@anna-charlotte anna-charlotte marked this pull request as ready for review January 31, 2025 07:55
@anna-charlotte anna-charlotte marked this pull request as draft February 5, 2025 14:38
logger.info("=== Starting training of TwoStepClassifier ===")

df = self._preprocess_data(df, x_cols)
best_result = None
df_train = df[df["rank"] < self._train_on_top_n]
Collaborator

Just from the name, I would assume _train_on_top_n is a boolean, but then this line would make no sense... maybe find a better name?
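To make the integer semantics concrete, a toy illustration (the rename is only a suggestion, not from the PR):

```python
import pandas as pd

df = pd.DataFrame({"rank": [0, 1, 2, 3], "score": [0.9, 0.7, 0.5, 0.1]})

# Hypothetical rename: a count-style name such as `train_on_top_n_ranks`
# makes it obvious at the comparison site that this is an integer cutoff.
train_on_top_n_ranks = 2
df_train = df[df["rank"] < train_on_top_n_ranks]  # keeps ranks 0 and 1
print(df_train)
```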

df_train = df[df["rank"] < self._train_on_top_n]
df_predict = df
# train and apply NN classifier
self.second_classifier.epochs = 10
Collaborator

deliberately hardcoded?

Contributor Author

Yes, but the value was chosen somewhat arbitrarily. The reason I set it to 10 was to avoid the error where BinaryClassifierLegacyNewBatching.fit() crashes in model_selection.train_test_split(x, y, test_size=self.test_size) when there aren’t enough samples in x for splitting. 10 worked for me, but I agree it’s not ideal and, as I said, it was chosen a bit at random. Do you have a suggestion on what to do instead?
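For context, a minimal reproduction of that kind of failure (illustrative values, not the PR's actual data):

```python
from sklearn import model_selection

# With a single sample, no test_size can yield non-empty train and test
# sets, so train_test_split raises a ValueError.
x, y = [[0.0]], [0]
try:
    model_selection.train_test_split(x, y, test_size=0.5)
except ValueError as e:
    print(e)  # "... the resulting train set will be empty ..."
```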

Collaborator

At least make it either a module-wide constant, or create a new method parameter and set it as the default (then it is more obvious that there is a knob to tune).
If it makes sense to have the user tune it -> config
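A minimal sketch of that suggestion (the constant, class, and method names here are illustrative, not the PR's actual code):

```python
# Assumed constant name; the value 10 is the one discussed above.
SECOND_CLASSIFIER_EPOCHS = 10


class TwoStepClassifier:
    def __init__(self, second_classifier):
        self.second_classifier = second_classifier

    def fit(self, df, second_classifier_epochs: int = SECOND_CLASSIFIER_EPOCHS):
        # The knob is now visible in the signature instead of being
        # buried as a hardcoded assignment inside the method body.
        self.second_classifier.epochs = second_classifier_epochs
        self.second_classifier.fit(df)
```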

Contributor Author

Oops, sorry, I mixed something up here and was talking about something else. Yes, this one is deliberately set to 10, but I will add a # TODO to line 126, where we set it to 50.

if scale_by_target_decoy_ratio:
n_targets = (df["decoy"] == 0).sum()
n_decoys = (df["decoy"] == 1).sum()
scaling_factor = round(n_targets / n_decoys, 3)
Collaborator

Please avoid raising ZeroDivisionError here, either by catching it or by adding a small epsilon to the denominator (not sure how the latter will affect the isfinite check, though).
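One way to guard the division, sketched under the assumption that falling back to a scaling factor of 1.0 is acceptable (that fallback is not from the PR):

```python
import pandas as pd

# Toy frame with no decoys, to exercise the guard (illustrative only).
df = pd.DataFrame({"decoy": [0, 0, 0]})

n_targets = (df["decoy"] == 0).sum()
n_decoys = (df["decoy"] == 1).sum()

# Skip the ratio when there are no decoys instead of dividing by zero;
# this also keeps any downstream isfinite check trivially satisfied.
scaling_factor = round(n_targets / n_decoys, 3) if n_decoys > 0 else 1.0
print(scaling_factor)
```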

torch.load(os.path.join(path, file), weights_only=False)
)
self.classifier_store[classifier_hash].append(classifier)
for file in os.listdir(path):
Collaborator

https://www.stuartellis.name/articles/python-modern-practices/#use-osscandir-instead-of-oslistdir
I would not have expected that reading that article yesterday would come in handy so quickly ;-)
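A minimal sketch of the os.scandir() variant, assuming the loop only needs each file's path (the function name and surrounding logic are paraphrased, not the PR's code):

```python
import os

import torch


def load_states(path: str) -> list:
    """Load serialized classifier states from a directory."""
    states = []
    # os.scandir() yields DirEntry objects that carry the joined path
    # and cached file-type info, avoiding extra stat() calls and the
    # manual os.path.join() that os.listdir() requires.
    for entry in os.scandir(path):
        if entry.is_file():
            states.append(torch.load(entry.path, weights_only=False))
    return states
```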

@anna-charlotte anna-charlotte marked this pull request as ready for review February 28, 2025 07:37