-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SBR calculation pipeline #1
Conversation
- Calculates SBR value and Flex Archetype given a set of user inputs - Calculations based on first set of excel calculations written by ESC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great!
Could the README be updated to explain how to use the library & what it does?
e.g. inputs, available functions, outputs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great work but some things to think about!
GOLD_FLEXER = "Gold Standard Flexer" | ||
|
||
|
||
def calc_flex_archetype(user_inputs: UserInputs, sbr_val: float) -> FlexArchetypes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something to consider here would be to instantiate the UserInputs
dataclass for each flex archetype and then check for equality (which dataclasses handle out of the box)...
E.g.
no_flexer = UserInputs(True, True, False, True ...)
if user_inputs == no_flexer:
return FlexArchetypes.NO_FLEXER.value
You could also then use a match ... case
statement to simplify further
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative implementation to Gus's suggestion - abstract each 'if conditions' into a separate function:
- This long if-conditions are very long and hard to read and going to be hard to unit test
- Would be easier to abstract each 'checks' into a function and unit test each function individually
- You also won't need long if-elif-else since you're returning at the end of each condition. You could just:
if check_gold_flexer(user_inputs):
return GOLD_FLEXER.value
if check_strong_flexer(user_inputs):
return STRONG_FLEXER.value
etc. That makes this function a lot smaller, testable, and readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In reference to @anguschadney's suggestion: Some archetypes depend on heating_source
being one out of multiple options or hot_water_source
having a specific value, e.g., to be a "Good Flexer" the household has to have either an EV, battery, HP or electric storage heater. This would mean I would have 4 x options for UserInputs for the user to be a "Good Flexer", i.e., good_flexer_1 = UserInputs(...), good_flexer_2 = UserInputs(...), good_flexer_3=, etc. For other flex types, there are even more ways of being that type of flexer. So this makes the suggested logic become quite complex as I would have to write out all the possible options for the user inputs for each archetype.
To simplify & clarify the code I have instead taken @shengy90's suggestion here. For the functions in scoring.py
, however, I have taken your suggested approach and created static classes for all the different instances which map to output values.
from src.smart_building_rating_calculator.flex_archetype import FlexArchetypes | ||
from src.smart_building_rating_calculator.main import sbr_score | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would test the individual scoring functions as well as the archetypes, otherwise you might miss edge cases (e.g. the archetypes do not cover all the possible combinations)
P.s. copilot will generate all of them for you if you write a comment like:
# Test all combinations of score_...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed - related to my above feedback of either abstracting out the scoring logic and test it one by one etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added tests to implement this - let me know if tests can be better written (the test all combinations functions is quite complex)
secondary_heating: bool, | ||
secondary_hot_water: bool, | ||
integrated_control_sys: bool, | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return typing:
from typing import Tuple
def test_fun(x: int) -> Tuple[int, int, int]:
return 1,2,3
Something like that - can't remember exact syntax but that's how you can type multiple returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also have a think about the user experience with this function. For someone that has pip-installed this library, what's their experience like calling this function? At the moment, they have to run a lot of import statements just to be able to call this function.
It'd be much easier for this function to take UserInput
class the input. Have a separate function that takes in 'user inputs' to instantiate the UserInput
class. Also have a think of how user will be creating the inputs.
For web clients, these would likely be in a JSON object/ string/ int values. A wrapper function to convert this string inputs into UserInput object which is then parsed into this function would be more convenient for end users.
You can then also write unit tests to test that inputs are parsed properly.
Implementation could be very similar to Gus's feedback on "instantiate the UserInputs dataclass for each flex archetype and then check for equality".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also avoid naming this script main.py
.
main.py
or app.py
is usually reserved for an application where main.py
/app.py
are the entry points to said app.
I would call this script "calculate_sbr_score.py`.
Think about the common python libraries that you use - I don't think you'll have ever run from package.main import fn
!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed the script to calculate_sbr_score.py
as recommended and reduced the number of imports required to run script.
tests/test_sbr_calculator.py
Outdated
from src.smart_building_rating_calculator.main import sbr_score | ||
|
||
|
||
def test_high_sbr_gold_flexer(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All these tests are testing the same function. I'd wrap them up in 1 test class:
class Test_sbr_score:
def test_gold_flexer():
def test_strong_flexer():
etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should also run tests for the intermediate functions in between - and not rely on just a single test on the final function, i.e. 'unit' test!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added more extensive testing functions & units tests.
Agreed with Gus's comments! Also I think we should break the functions down a bit more and write proper tests. Right now we're only testing 'end-to-end' which isn't quite 'unit test'! |
tests/test_sbr_calculator.py
Outdated
class TestScoreCalculators: | ||
"""Test the scoring calculations using the results in the SBR excel sheet created by ESC""" | ||
|
||
def test_smartest_home(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proper way to do this is to parametrize the test by:
- Creating these UserInputs as pytest fixtures
- Call these fixtures and parametrize the test and compare it against expected results
https://engineeringfordatascience.com/posts/pytest_fixtures_with_parameterize/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks, I have now implemented fixtures as arguments.
tests/test_sbr_calculator.py
Outdated
secondary_hot_water=True, | ||
integrated_control_sys=True, | ||
) | ||
elec_scores = calc_electrification_score(user_inputs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be unit testing these functions. This test right now is testing way too many things at once. If there's a bug in one of this function, this test would fail without giving developers any clue as to where to even begin troubleshooting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if better now please. Thanks!
tests/test_sbr_calculator.py
Outdated
elec_scores.insert(0, smart_meter_score) | ||
elec_scores.insert(len(elec_scores), ics_score) | ||
|
||
expected_scores = [1, 3, 3, 1, 3, -1, 4, 2, 1, 1, 1.5] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of unit tests is not just to test correct input = correct output, but also to add syntax to explain to people what the tests are actually checking.
Writing expected scores like these are not giving developers any syntax on what these numbers mean or why we should be expecting this results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if better now please. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I am taking the expected scores from ESC's spreadsheet - I don't know what the numbers mean or why we should be expecting this results, other than I expect these values based on the calculations ESC have provided!
tests/test_sbr_calculator.py
Outdated
assert flex_archetype == FlexArchetype.LOW_TECH_FLEXER | ||
|
||
|
||
def test_all_combinations(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I like this test:
- It's not clear what this test is trying to test for
- If you're writing that many levels of nested for/if statements, that should be a bit of an alarm bell already!
- Unit test is meant to test 1 small thing at a time to provide clues to where bugs are. This test is testing a lot of things at once that if broken, doesn't actually provide clues to developers where to debug, i.e. not serving its purpose.
If for some reason you really need to do that many for loops, a more pythonic way is to use itertools.product to get the cartesian product of lists:
https://note.nkmk.me/en/python-itertools-product/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is to check that all possible inputs are converted to valid outputs. This test actually flagged some bugs in the code where some combinations of inputs gave weird outputs. so was a helpful test! I will remove for loops for cartesian products.
Response to second round of review comments
return user_inputs | ||
|
||
|
||
@pytest.mark.parametrize( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-blocking: You could also test for negative cases too and check that errors are expected:
https://docs.pytest.org/en/8.1.x/how-to/skipping.html#skip-xfail-with-parametrize
…uilding-rating-calculator into sbr-calculation
tests/test_sbr_calculator.py
Outdated
or (inputs[4] and inputs[5] == BatterySize.NONE) | ||
or (inputs[6] and inputs[7] == SolarInverterSize.NONE) | ||
): | ||
with pytest.raises(AssertionError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unclear exactly what this assertion test is testing for by looking at this code. Looking at the function sbr_score
, that function is not running any assertion tests.
So it's not clear why I'm expecting certain inputs to be failing here. If assertion tests are failing even further upstream of sbr_score
, then why are we not testing at source but nesting assertion tests so deeply?
I had to trace all the functions called by sbr_score
to actually find out assertion tests are carried out in prep_user_input
. These assertion tests should be written as a test to test prep_user_input
directly instead of having these tests here.
The whole point of unit tests is so that when test fails, we can quickly identify where the source of bug is. Right now it's not apparent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, thanks Sheng. I now test on prep_user_input
directly
tests/test_sbr_calculator.py
Outdated
with pytest.raises(AssertionError): | ||
sbr_normalised, sbr, flex_archetype = sbr_score(*inputs) | ||
|
||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better way to write this test is to wrap this up as a pytest Test class:
class TestAllCombination:
@pytest.mark.xfail(raises=AssertionError)
def test_raises_exceptions_for_bad_inputs(self):
"""run logic that will fail test"""
def test_assert_expected_types(self):
assert isinstance(x, float), "x is not float"
assert isinstance(x, str), "x is not str"
def test_assert_value_in_list(self):
assert x in y, "x must be in y"
etc.
This breaks down the test into individual tests, each tests doing 1 thing, so that when things break you can easily find trace back to the source where bugs occur and fixes.
Doing that also makes unit testing work with many IDEs test plugins to quickly navigate failing tests and find out why tests are failing.
The purpose of unit tests is not just for one to check logic, it's also to guard against regression, i.e. if a new developer comes in and make changes and accidentally break codes, it helps them to quickly find out where things are breaking and make fixes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully better now!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving to unblock but left non-blocking comments on tests.
Wrote SBR calculation pipeline based on ESC's calculations: https://docs.google.com/spreadsheets/d/1LqUzYlYcvkqyx9c1_ZokSeoSzK7HsX9g/edit#gid=1184908035
Ian should use
main.py
script to output SBR, SBR values and Flex Archetype for a given set of user inputs.