fix for boolean type conversion #78

akurdyukov · 2024-03-05T10:28:01Z

Fix for #77

bryzgaloff

Hi @akurdyukov, and thank you for your contribution! Please excuse the late reply.

I have added several comments. However, I believe the solution to this should be more fundamental: we should hard-code clickhouse_driver.Client's signature into the plugin's code, so that the code fully controls all of the arguments (in contrast to a pretty faulty **connection_kwargs approach) and converts them to proper types. This will make the behaviour more predictable.

In other words, I believe there should be a preliminary PR which replaces Client(**…) with Client(host=…, …) following its signature from clickhouse-driver here:

airflow-clickhouse-plugin/src/airflow_clickhouse_plugin/hooks/clickhouse.py

Line 54 in c7426a1

return clickhouse_driver.Client(**conn_to_kwargs(conn, self._database))

The PR you are currently working on can then easily handle this specific str-to-bool case.
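A minimal sketch of the idea above, not the plugin's actual code: every supported argument is named and converted explicitly instead of being forwarded through `**kwargs`. The helper name `build_client_kwargs`, the `extra` mapping, and the chosen subset of arguments (`secure`, `verify`, `connect_timeout`) are assumptions for illustration. The function returns a dict rather than constructing `clickhouse_driver.Client`, so that it runs without the driver installed.

```python
from typing import Any, Dict, Mapping, Optional

# Hypothetical lookup table: only these string spellings are accepted.
_BOOL_STRINGS = {'true': True, 'True': True, 'false': False, 'False': False}


def _to_bool(raw: Any) -> bool:
    """Pass real booleans through; convert the accepted strings."""
    if isinstance(raw, bool):
        return raw
    try:
        return _BOOL_STRINGS[raw]
    except (KeyError, TypeError):
        raise ValueError(f'unsupported value: {raw!r}') from None


def build_client_kwargs(
    host: str,
    port: Optional[int] = None,
    database: str = '',
    extra: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Build explicitly typed keyword arguments (hypothetical helper)."""
    remaining = dict(extra or {})
    kwargs: Dict[str, Any] = {'host': host, 'database': database}
    if port is not None:
        kwargs['port'] = int(port)
    # Each supported argument is listed by name with its own conversion.
    if 'secure' in remaining:
        kwargs['secure'] = _to_bool(remaining.pop('secure'))
    if 'verify' in remaining:
        kwargs['verify'] = _to_bool(remaining.pop('verify'))
    if 'connect_timeout' in remaining:
        kwargs['connect_timeout'] = int(remaining.pop('connect_timeout'))
    # Anything left over is unknown to this code, so fail loudly.
    if remaining:
        raise TypeError(f'unsupported arguments: {sorted(remaining)}')
    return kwargs
```

With this shape, an unknown or mistyped key fails early in the plugin rather than deep inside the driver, which is the predictability argument made above.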

bryzgaloff · 2024-03-18T13:53:13Z

src/airflow_clickhouse_plugin/hooks/clickhouse.py

+    if val in ('y', 'yes', 't', 'true', 'on', '1'):
+        return 1
+    elif val in ('n', 'no', 'f', 'false', 'off', '0'):
+        return 0


Where do these lists of values come from?

Just a generic list of possible bool values people use

Quite biased. Instead let's support only true/false options. It is impossible to guess what a random user may consider to be a true/false str representation.

Alright, only 'true', 'True', 'false', 'False' are supported
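For reference, a sketch of what the narrowed conversion could look like, assuming exactly the four spellings listed above. Returning `bool` rather than `1`/`0` and the parameter name `str_value` are choices made for this illustration, not necessarily what the PR ends up with.

```python
def strtobool(str_value: str) -> bool:
    """Convert 'true'/'True'/'false'/'False' to a bool; reject anything else."""
    if str_value in ('true', 'True'):
        return True
    if str_value in ('false', 'False'):
        return False
    # No guessing: 'yes', 'on', '1' and similar are deliberately rejected.
    raise ValueError(f'unsupported value: {str_value!r}')
```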

src/airflow_clickhouse_plugin/hooks/clickhouse.py

tests/unit/hooks/test_clickhouse.py

src/airflow_clickhouse_plugin/hooks/clickhouse.py

akurdyukov · 2024-03-19T06:56:30Z

Thanks for the review! I fixed most of the review comments.

Regarding the first one, about how arguments are passed to clickhouse_driver.Client: there are currently 23 arguments, most of them optional. So a minimal-boilerplate version would need something like Pydantic, which looks like overkill to me. What do you think?

bryzgaloff · 2024-04-10T17:22:22Z

src/airflow_clickhouse_plugin/hooks/clickhouse.py

    else:
-        raise ValueError("invalid truth value %r" % (val,))
+        raise ValueError(f'invalid truth value {str_value!r}')


else is redundant here

Sure, fixed

bryzgaloff · 2024-04-10T17:25:43Z

tests/unit/hooks/test_strtobool.py

+    def test_correct_true(self):
+        self.assertTrue(strtobool('true'))
+
+    def test_correct_one(self):
+        self.assertTrue(strtobool('1'))
+
+    def test_correct_false(self):
+        self.assertFalse(strtobool('false'))
+
+    def test_correct_zero(self):
+        self.assertFalse(strtobool('0'))


A quick best-practice comment: the tests must cover all supported input values, not only a few; full coverage is required. Since this is boilerplate code, you may use self.subTest to check all truthy and all falsy values in a loop instead of writing one test per value.
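A sketch of the `self.subTest` pattern suggested above. The `strtobool` stand-in and the value lists are assumptions so that the example runs on its own; in the PR the function would be imported from the plugin and the lists would match whatever it actually supports.

```python
import unittest


def strtobool(str_value: str) -> bool:
    """Stand-in for the function under test (assumed behaviour)."""
    if str_value in ('true', 'True'):
        return True
    if str_value in ('false', 'False'):
        return False
    raise ValueError(f'unsupported value: {str_value!r}')


class StrToBoolTestCase(unittest.TestCase):
    TRUTHY = ('true', 'True')
    FALSY = ('false', 'False')
    UNSUPPORTED = ('yes', 'no', '1', '0', '')

    def test_truthy_values(self):
        for value in self.TRUTHY:
            # Each value is reported separately if it fails.
            with self.subTest(value=value):
                self.assertTrue(strtobool(value))

    def test_falsy_values(self):
        for value in self.FALSY:
            with self.subTest(value=value):
                self.assertFalse(strtobool(value))

    def test_unsupported_values_raise(self):
        for value in self.UNSUPPORTED:
            with self.subTest(value=value):
                with self.assertRaises(ValueError):
                    strtobool(value)
```

Unlike a plain loop, `subTest` keeps iterating after a failure and names the failing value in the report, so one test method can cover the whole list without hiding which input broke.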

bryzgaloff · 2024-04-10T17:27:09Z

src/airflow_clickhouse_plugin/hooks/clickhouse.py

    else:
-        raise ValueError("invalid truth value %r" % (val,))
+        raise ValueError(f'invalid truth value {str_value!r}')


Why "truth" value btw? What if it was intended to be a falsy one? :)

Suggested change

-        raise ValueError(f'invalid truth value {str_value!r}')
+        raise ValueError(f'unsupported value: {str_value!r}')

bryzgaloff · 2024-04-10T17:31:10Z

Regarding the first one, about how arguments are passed to clickhouse_driver.Client: there are currently 23 arguments, most of them optional. So a minimal-boilerplate version would need something like Pydantic, which looks like overkill to me. What do you think?

Hi @akurdyukov, yes, using an external library is definitely overkill here. I suggest hardcoding all the arguments: clickhouse-driver is not expected to change them often, so we can accept new PRs whenever the arguments change in the underlying library.

Update to latest upstream

fix for boolean type conversion

650198d

bryzgaloff requested changes Mar 18, 2024

review fixes

7055b33

bryzgaloff requested changes Apr 10, 2024

akurdyukov added 3 commits August 28, 2024 13:47

Merge pull request #1 from bryzgaloff/master

27a6c64

Update to latest upstream

fix for bool string values

0a015f0

optimized tests used subTest

7f4e3fb
