-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parametrises column names #269
Conversation
Tidying up doc strings we are switching to the newer format of `str | Path` rather than the older `Union[str, Path]`. This can be supported under Python 3.8 with `from __future__ import annotations` but since Python 3.8 is due to reach End of Life in [2024-10-31](https://endoflife.date/python) there is no harm in dropping support now and setting the minimum version to 3.9 and adding support/testing for 3.11.
+ Documentation on output columns added. + Parametrises columns in `lib/pgfinder/config/columns.yaml` and loads these on initialisation. + Uses the columns defined and loaded in `pgfinder.find_pg` functions for loading ftrs and maxquant files. + Adds a dictionary for mapping and renaming consolidated variables. Abstracting the information out of hard-coded variables into parameter files simplifies the process of extending to future update file formats and makes managing changes (still to come in this PR) easier. Will have to add to documentation how the configuration file works and perhaps also something to change it, although the way things are going I don't think anyone will use this interactively and so this documentation would be for developers (which doesn't make it any less important as the more we can help our future selves the better).
+ Updates `pgfinder/matching.py` to use the parametrised fields names. + Annotates `pgfinder/config/columns.yaml` with comments. + Updates regression test files (column headers are the only thing that has changed) + Updates tests for consolidation (column headers are the only thing that has changed) + Removes comment from test.
Moving columns to a config file has always felt like the ideal solution! Happy to give this a look soon — is there a reason tests are failing still? |
Yes, I figured if I was going to change lots of things in lots of places it would be sensible to abstract it out and parameterise the values so its easier to change and extend in the future. Tests passed locally (for once I checked before pushing and making the PR). I'll have a look tomorrow. |
Seems I was mistaken about the minimum Python version for the newer method of writing type hints.
Haha, just beat me to it! https://peps.python.org/pep-0604/ (minimum version of 3.10) |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #269 +/- ##
==========================================
+ Coverage 70.25% 72.37% +2.12%
==========================================
Files 11 10 -1
Lines 511 496 -15
==========================================
Hits 359 359
+ Misses 152 137 -15 ☔ View full report in Codecov by Sentry. |
@ns-rse Unfortunately looks like some Windows demons to slay — I've tried a couple of times to re-run it, but with no luck. I'm also stuck without a Windows machine to replicate it outside of the CI! After that's ironed out, I'll review! |
The error stems from I don't see any way the changes this PR introduces could cause these errors, the |
6298479
to
4d49244
Compare
@ns-rse Not sure how you'd like to carry on from here — I'd agree it doesn't seem to be the fault of this diff. Would you want to drop the 3.11 testing for Windows from the CI in the meantime? |
@TheLostLambda Tricky, Python and OS versions are defined via matrixes, so changing that makes it fiddly. We could over-ride the merging rules of requiring all tests to pass and write it up as an issue to be addressed in due course perhaps. |
@ns-rse I'm happy enough with that! I'll review this and we can get it merged then! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, with just a few bits of documentation that I could answer questions for, some commented bits of code that should probably be deleted, and a non-essential note or two!
I'll approve if you want to tidy those last little bits and then merge things in!
ascending=[True, False], | ||
key=lambda k: abs(k) if is_numeric_dtype(k) else k, | ||
inplace=True, | ||
kind="stable", | ||
) | ||
group.reset_index(drop=True, inplace=True) | ||
|
||
# The hard coded 'Delta ppm' and 'Intensity column names aren't yet handled dynamically |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems somewhat similar with the "Inferred structure ..." and "Intensity ..." ones which are only partially dynamic. If this is a faff to sort out though, then there is no need to worry about it for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't address these as it would have required making pgfinder.input
a dictionary so that the Intensity
column could be referenced and would have required another entry under pgfinder
to hold the value of Delta ppm
.
I'll do those in a separate PR though tomorrow as there is no point in leaving the job half-done. Thanks for picking me up on this.
Thanks for feedback from @TheLostLambda Also ordered pre-commit hooks.
Closes #267
Parametrises the column headers for input files (different versions of
ftrs
andmaxquant
, theoretical masses) and the columns PGFinder renames these to and subsequently derived columns based on these.These values were hard-coded at various places through out the code, this makes updating them awkward and fickle. Instead the values are defined in the file
pgfinder/config/columns.yaml
and this is loaded to a Python dictionary. The reference to these values are then used at various points rather than hard coded values themselves.Required tweaks to...
pgfinder/pgio.py
pgfinder/validation.py
pgfinder/matching.py
...as well as some of the test files which were used for comparison (including the regression tests) where the header columns have changed.
This should in theory make it somewhat easier to...
ftrs
version changes.Other changes
str | Path
rather than the olderUnion[str, Path]
. This can be supported under Python 3.8 withfrom __future__ import annotations
but Python 3.8 is due to reach End of Life in 2024-10-31.