Describe the bug
With pandas 2.0, query_df returns DataFrames whose Date/DateTime/DateTime64 columns have different dtypes than with pandas 1.5.3, as shown in the table below:
ClickHouse column type           | query_df dtype with pandas==1.5.3 | query_df dtype with pandas==2.0
---------------------------------|-----------------------------------|--------------------------------
Date                             | datetime64[ns]                    | datetime64[s]
DateTime                         | datetime64[ns]                    | datetime64[s]
DateTime('America/Chicago')      | datetime64[ns]                    | datetime64[s]
DateTime64(6)                    | datetime64[ns]                    | datetime64[us]
DateTime64(6, 'America/Chicago') | datetime64[ns]                    | datetime64[us]
Steps to reproduce
Create a ClickHouse table with columns of the above types, insert values, and query it using query_df.
Expected behaviour
The dtypes of the DataFrame returned by query_df are consistent regardless of pandas version used.
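Until the behaviour is settled, a caller can get version-independent dtypes by casting the result after the query. The following is only a sketch of such a workaround, not part of clickhouse-connect: the helper name `normalize_datetime_columns` is hypothetical, and it assumes nanosecond resolution is the desired target and that column names are unique.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype


def normalize_datetime_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every datetime column cast to nanosecond resolution.

    Hypothetical helper: time zone information is preserved, and
    non-datetime columns are left untouched.
    """
    out = df.copy()
    for name in out.columns:
        col = out[name]
        if not is_datetime64_any_dtype(col.dtype):
            continue  # skip non-datetime columns
        tz = getattr(col.dtype, "tz", None)
        if tz is None:
            # Time-zone-naive column: cast to plain nanosecond datetimes.
            out[name] = col.astype("datetime64[ns]")
        else:
            # Time-zone-aware column: keep the zone, change only the unit.
            out[name] = col.astype(pd.DatetimeTZDtype(unit="ns", tz=tz))
    return out


if __name__ == "__main__":
    # Stand-in for a DataFrame returned by query_df.
    frame = pd.DataFrame({"ts": pd.to_datetime(["2023-04-03 12:00:00"]), "n": [1]})
    print(normalize_datetime_columns(frame).dtypes)
```

Casting to nanoseconds narrows the representable range to roughly the years 1677 to 2262, so values outside that range would fail to convert; that trade-off is one reason the coarser pandas 2.0 units may be preferable.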
Configuration
Environment
clickhouse-connect version: 0.5.12
Python version: 3.9.13
pandas versions: 1.5.3 and 2.0
Operating system: Red Hat Enterprise Linux 8
ClickHouse server
ClickHouse Server version: 22.3.9
I see. Do you plan on modifying the behavior of query_df to return consistent dtypes across different versions of pandas, regardless of whether those dtypes are the old ones or the new (and more correct) ones?
I'll have to dig into it; I don't know enough about the differences between Pandas versions. My first thought is that all datetime types in Pandas 1.x are given the dtype datetime64[ns] (since nanoseconds is always the underlying granularity of a pandas Timestamp object), and that this might be different in the new Pandas version. In that case I'd be inclined to keep the new and arguably better behavior.
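That guess can be checked without a ClickHouse server. The snippet below is a minimal sketch that builds a Series from a second-resolution NumPy array and prints the resulting dtype; it assumes only that NumPy and pandas are installed. pandas 1.x converts such input to nanoseconds, while pandas 2.0 keeps the supported units (s, ms, us, ns) as given.

```python
import numpy as np
import pandas as pd

# A NumPy datetime array with second resolution, similar in spirit to
# what a driver might produce for a ClickHouse DateTime column.
values = np.array(["2023-04-03T12:00:00"], dtype="datetime64[s]")
series = pd.Series(values)

# The printed dtype depends on the installed pandas version.
print(pd.__version__, series.dtype)
```

Running it under both pandas versions shows whether the dtype change originates in pandas itself rather than in query_df.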