Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support of ISO8601 timestamp strings for inserting into table with DateTime64 column type #439

Open
rnv812 opened this issue Dec 15, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@rnv812
Copy link
Contributor

rnv812 commented Dec 15, 2024

Current implementation gives us the opportunity to pass DateTime64 value as int or datetime object. I think it would be really helpful if it can be possible to pass DateTime64 value as str in ISO8601 format (e.g. 2024-12-15T18:14:17.206854).

In my use case I receive some data from service in json format. Timestamps are stored as strings in ISO8601 format. For now I have to convert each document timestamp using datetime.fromisoformat().

I also notice some code fragment in DateTime64 class which is a potential bug. In clickhouse_connect/datatypes/temporal.py on line 211, microseconds are added to float value which is already includes microseconds.

image

@rnv812 rnv812 added the enhancement New feature or request label Dec 15, 2024
@genzgd
Copy link
Collaborator

genzgd commented Dec 15, 2024

Regarding the "bug", microseconds are added to the (int) of the float value. This is to avoid rounding errors that are inherent in the underlying float value.

As for the enhancement request, it makes sense but such an enhancement would be doing exactly the same thing as you do yourself, and there are many other string formats that are not supported natively by Python. There's no harm in it but it would have the same performance hit.

@rnv812
Copy link
Contributor Author

rnv812 commented Dec 16, 2024

Thank you for your answer,

Oh, now I see (about rounding errors avoiding).

Regarding the topic of datetime formats, I think that support of a datetime standard (ISO8601) which is can be done with standard tools of language generally has a place in the library. I understand your point that the same conversion operation can be performed at the main level code, not at the library level. However, I rather pursued the goal of minimizing the conversion logic in the main code and delegating it to the library (not for all date formats but only for international standard).

For now I chose next solution that delegates parsing to ClickHouse server.

...
response = await self._client.raw_insert(
    table=self._fq_table_name,
    insert_block=('\n'.join(documents) + '\n'),
    fmt='JSONEachRow'
)
...

As the alternative I'm fully satisfied with that. So we can close that issue if implementation of suggestion is not intended.

P.S.
Thank you for your effort in creating this library :)

@genzgd
Copy link
Collaborator

genzgd commented Dec 16, 2024

If you are starting with JSON initially, handing that off to the ClickHouse server using raw_insert makes perfect sense, and you get the benefit of ClickHouse using "best effort" to parse the string values.

As I said, the enhancement makes sense but there are several other requests without a such clean workaround that will be higher priority. I'm happy to leave the issue open, but unless there is a community contribution it probably won't get worked on for quite some time.

rnv812 added a commit to rnv812/clickhouse-connect that referenced this issue Dec 17, 2024
rnv812 added a commit to rnv812/clickhouse-connect that referenced this issue Dec 17, 2024
@rnv812
Copy link
Contributor Author

rnv812 commented Dec 17, 2024

I added PR (#440) that touches only DateTime64 and not other temporal types (not to dive deep into the library structure, but implement minimal valuable feature)

genzgd pushed a commit that referenced this issue Dec 18, 2024
…` type (#440)

* [#439] Add support of ISO8601 timestamp strings for DateTime64 column type

* [#439] Update CHANGELOG.md

* [#440] Remove checking write format for str objects in `DateTime64._write_column_binary`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants