Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade from 40 to 43 causes utf8 timestamp queries to fail #13625

Closed
Tracked by #13648
TheBuilderJR opened this issue Dec 3, 2024 · 12 comments
Closed
Tracked by #13648

Upgrade from 40 to 43 causes utf8 timestamp queries to fail #13625

TheBuilderJR opened this issue Dec 3, 2024 · 12 comments
Labels
bug Something isn't working

Comments

@TheBuilderJR
Copy link
Contributor

Describe the bug

I had a column timestamp that is UTF8

        {
            "name": "timestamp",
            "data_type": "Utf8",
            "nullable": true,
            "dict_id": 0,
            "dict_is_ordered": false,
            "metadata": {}
        }

Previously if i ran a query like

SELECT * FROM server_logs WHERE timestamp IS NOT NULL ORDER BY timestamp DESC LIMIT 3

datafusion would return with no errors, but after upgrading to 43 I get

 Failed to execute SQL query

Caused by:
    Error during planning: Error during planning: Coercion from [Utf8, Utf8View] to the signature OneOf([Exact([Utf8, Timestamp(Nanosecond, None)]), Exact([Utf8View, Timestamp(Nanosecond, None)]), Exact([Utf8, Timestamp(Nanosecond, Some("+TZ"))]), Exact([Utf8View, Timestamp(Nanosecond, Some("+TZ"))]), Exact([Utf8, Timestamp(Millisecond, None)]), Exact([Utf8View, Timestamp(Millisecond, None)]), Exact([Utf8, Timestamp(Millisecond, Some("+TZ"))]), Exact([Utf8View, Timestamp(Millisecond, Some("+TZ"))]), Exact([Utf8, Timestamp(Microsecond, None)]), Exact([Utf8View, Timestamp(Microsecond, None)]), Exact([Utf8, Timestamp(Microsecond, Some("+TZ"))]), Exact([Utf8View, Timestamp(Microsecond, Some("+TZ"))]), Exact([Utf8, Timestamp(Second, None)]), Exact([Utf8View, Timestamp(Second, None)]), Exact([Utf8, Timestamp(Second, Some("+TZ"))]), Exact([Utf8View, Timestamp(Second, Some("+TZ"))]), Exact([Utf8, Date64]), Exact([Utf8View, Date64]), Exact([Utf8, Date32]), Exact([Utf8View, Date32]), Exact([Utf8, Time32(Second)]), Exact([Utf8View, Time32(Second)]), Exact([Utf8, Time32(Millisecond)]), Exact([Utf8View, Time32(Millisecond)]), Exact([Utf8, Time64(Microsecond)]), Exact([Utf8View, Time64(Microsecond)]), Exact([Utf8, Time64(Nanosecond)]), Exact([Utf8View, Time64(Nanosecond)]), Exact([Utf8, Interval(YearMonth)]), Exact([Utf8View, Interval(YearMonth)]), Exact([Utf8, Interval(DayTime)]), Exact([Utf8View, Interval(DayTime)]), Exact([Utf8, Interval(MonthDayNano)]), Exact([Utf8View, Interval(MonthDayNano)]), Exact([Utf8, Duration(Second)]), Exact([Utf8View, Duration(Second)]), Exact([Utf8, Duration(Millisecond)]), Exact([Utf8View, Duration(Millisecond)]), Exact([Utf8, Duration(Microsecond)]), Exact([Utf8View, Duration(Microsecond)]), Exact([Utf8, Duration(Nanosecond)]), Exact([Utf8View, Duration(Nanosecond)])]) failed.

To Reproduce

Create a similar looking schema and run a query against it

Expected behavior

No errors

Additional context

No response

@TheBuilderJR TheBuilderJR added the bug Something isn't working label Dec 3, 2024
@holicc
Copy link

holicc commented Dec 3, 2024

@TheBuilderJR Cannot reproduce this problem on version 43.0.0. Maybe something is missing here.

 let df = ctx.sql("create table logs(timestamp varchar);").await?;
 let df = ctx
        .sql(r#"SELECT * FROM logs WHERE timestamp IS NOT NULL ORDER BY timestamp DESC LIMIT 3"#)
        .await?;
 println!("{:?}",df.schema());
 df.show().await?;

got results:

DFSchema { inner: Schema { fields: [Field { name: "timestamp", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }, field_qualifiers: [Some(Bare { table: "logs" })], functional_dependencies: FunctionalDependencies { deps: [] } }
+-----------+
| timestamp |
+-----------+

@findepi
Copy link
Member

findepi commented Dec 3, 2024

Caused by:
Error during planning: Error during planning: Coercion from [Utf8, Utf8View] to the signature OneOf([Exact([Utf8, Timestamp(Nanosecond, None)]), Exact([Utf8View, Timestamp(Nanosecond, None)]), Exact([Utf8, Timestamp(Nanosecond, Some("+TZ"))]), Exact([Utf8View, Timestamp(Nanosecond, Some("+TZ"))]), Exact([Utf8, Timestamp(Millisecond, None)]), Exact([Utf8View, Timestamp(Millisecond, None)]), Exact([Utf8, Timestamp(Millisecond, Some("+TZ"))]), Exact([Utf8View, Timestamp(Millisecond, Some("+TZ"))]), Exact([Utf8, Timestamp(Microsecond, None)]), Exact([Utf8View, Timestamp(Microsecond, None)]), Exact([Utf8, Timestamp(Microsecond, Some("+TZ"))]), Exact([Utf8View, Timestamp(Microsecond, Some("+TZ"))]), Exact([Utf8, Timestamp(Second, None)]), Exact([Utf8View, Timestamp(Second, None)]), Exact([Utf8, Timestamp(Second, Some("+TZ"))]), Exact([Utf8View, Timestamp(Second, Some("+TZ"))]), Exact([Utf8, Date64]), Exact([Utf8View, Date64]), Exact([Utf8, Date32]), Exact([Utf8View, Date32]), Exact([Utf8, Time32(Second)]), Exact([Utf8View, Time32(Second)]), Exact([Utf8, Time32(Millisecond)]), Exact([Utf8View, Time32(Millisecond)]), Exact([Utf8, Time64(Microsecond)]), Exact([Utf8View, Time64(Microsecond)]), Exact([Utf8, Time64(Nanosecond)]), Exact([Utf8View, Time64(Nanosecond)]), Exact([Utf8, Interval(YearMonth)]), Exact([Utf8View, Interval(YearMonth)]), Exact([Utf8, Interval(DayTime)]), Exact([Utf8View, Interval(DayTime)]), Exact([Utf8, Interval(MonthDayNano)]), Exact([Utf8View, Interval(MonthDayNano)]), Exact([Utf8, Duration(Second)]), Exact([Utf8View, Duration(Second)]), Exact([Utf8, Duration(Millisecond)]), Exact([Utf8View, Duration(Millisecond)]), Exact([Utf8, Duration(Microsecond)]), Exact([Utf8View, Duration(Microsecond)]), Exact([Utf8, Duration(Nanosecond)]), Exact([Utf8View, Duration(Nanosecond)])]) failed.

i wonder which query operation could have produced this error.
In current code base, "Coercion from ... to the signature" seems to be produced by function-related coecions, but in the query there is no function call, unless logs is a view. In such case, knowing the view definition can be necessary to reproduce the error.

In any case, here is a proposal to improve the error message: #13628

@TheBuilderJR
Copy link
Contributor Author

TheBuilderJR commented Dec 3, 2024

Thanks @holicc and @findepi I think I looked at the wrong query in the logs. You can reproduce with something like this

SELECT 
                EXTRACT(YEAR FROM timestamp) as year,
                EXTRACT(MONTH FROM timestamp) as month,
                COUNT(*) as count
             FROM logs
             WHERE timestamp IS NOT NULL 
             GROUP BY 
                EXTRACT(YEAR FROM timestamp),
                EXTRACT(MONTH FROM timestamp)
             ORDER BY year, month

which produces

Caused by:
    Error during planning: Error during planning: Coercion from [Utf8, Utf8View] to the signature OneOf([Exact([Utf8, Timestamp(Nanosecond, None)]), Exact([Utf8View, Timestamp(Nanosecond, None)]), Exact([Utf8, Timestamp(Nanosecond, Some("+TZ"))]), Exact([Utf8View, Timestamp(Nanosecond, Some("+TZ"))]), Exact([Utf8, Timestamp(Millisecond, None)]), Exact([Utf8View, Timestamp(Millisecond, None)]), Exact([Utf8, Timestamp(Millisecond, Some("+TZ"))]), Exact([Utf8View, Timestamp(Millisecond, Some("+TZ"))]), Exact([Utf8, Timestamp(Microsecond, None)]), Exact([Utf8View, Timestamp(Microsecond, None)]), Exact([Utf8, Timestamp(Microsecond, Some("+TZ"))]), Exact([Utf8View, Timestamp(Microsecond, Some("+TZ"))]), Exact([Utf8, Timestamp(Second, None)]), Exact([Utf8View, Timestamp(Second, None)]), Exact([Utf8, Timestamp(Second, Some("+TZ"))]), Exact([Utf8View, Timestamp(Second, Some("+TZ"))]), Exact([Utf8, Date64]), Exact([Utf8View, Date64]), Exact([Utf8, Date32]), Exact([Utf8View, Date32]), Exact([Utf8, Time32(Second)]), Exact([Utf8View, Time32(Second)]), Exact([Utf8, Time32(Millisecond)]), Exact([Utf8View, Time32(Millisecond)]), Exact([Utf8, Time64(Microsecond)]), Exact([Utf8View, Time64(Microsecond)]), Exact([Utf8, Time64(Nanosecond)]), Exact([Utf8View, Time64(Nanosecond)]), Exact([Utf8, Interval(YearMonth)]), Exact([Utf8View, Interval(YearMonth)]), Exact([Utf8, Interval(DayTime)]), Exact([Utf8View, Interval(DayTime)]), Exact([Utf8, Interval(MonthDayNano)]), Exact([Utf8View, Interval(MonthDayNano)]), Exact([Utf8, Duration(Second)]), Exact([Utf8View, Duration(Second)]), Exact([Utf8, Duration(Millisecond)]), Exact([Utf8View, Duration(Millisecond)]), Exact([Utf8, Duration(Microsecond)]), Exact([Utf8View, Duration(Microsecond)]), Exact([Utf8, Duration(Nanosecond)]), Exact([Utf8View, Duration(Nanosecond)])]) failed. No function matches the given name and argument types 'date_part(Utf8, Utf8View)'. You might need to add explicit type casts.
        Candidate functions:
        date_part(Utf8, Timestamp(Nanosecond, None))
        date_part(Utf8View, Timestamp(Nanosecond, None))
        date_part(Utf8, Timestamp(Nanosecond, Some("+TZ")))
        date_part(Utf8View, Timestamp(Nanosecond, Some("+TZ")))
        date_part(Utf8, Timestamp(Millisecond, None))
        date_part(Utf8View, Timestamp(Millisecond, None))
        date_part(Utf8, Timestamp(Millisecond, Some("+TZ")))
        date_part(Utf8View, Timestamp(Millisecond, Some("+TZ")))
        date_part(Utf8, Timestamp(Microsecond, None))
        date_part(Utf8View, Timestamp(Microsecond, None))
        date_part(Utf8, Timestamp(Microsecond, Some("+TZ")))
        date_part(Utf8View, Timestamp(Microsecond, Some("+TZ")))
        date_part(Utf8, Timestamp(Second, None))
        date_part(Utf8View, Timestamp(Second, None))
        date_part(Utf8, Timestamp(Second, Some("+TZ")))
        date_part(Utf8View, Timestamp(Second, Some("+TZ")))
        date_part(Utf8, Date64)
        date_part(Utf8View, Date64)
        date_part(Utf8, Date32)
        date_part(Utf8View, Date32)
        date_part(Utf8, Time32(Second))
        date_part(Utf8View, Time32(Second))
        date_part(Utf8, Time32(Millisecond))
        date_part(Utf8View, Time32(Millisecond))
        date_part(Utf8, Time64(Microsecond))
        date_part(Utf8View, Time64(Microsecond))
        date_part(Utf8, Time64(Nanosecond))
        date_part(Utf8View, Time64(Nanosecond))
        date_part(Utf8, Interval(YearMonth))
        date_part(Utf8View, Interval(YearMonth))
        date_part(Utf8, Interval(DayTime))
        date_part(Utf8View, Interval(DayTime))
        date_part(Utf8, Interval(MonthDayNano))
        date_part(Utf8View, Interval(MonthDayNano))
        date_part(Utf8, Duration(Second))
        date_part(Utf8View, Duration(Second))
        date_part(Utf8, Duration(Millisecond))
        date_part(Utf8View, Duration(Millisecond))
        date_part(Utf8, Duration(Microsecond))
        date_part(Utf8View, Duration(Microsecond))
        date_part(Utf8, Duration(Nanosecond))
        date_part(Utf8View, Duration(Nanosecond))

@alamb
Copy link
Contributor

alamb commented Dec 3, 2024

Does the problem go away if you turn off this config setting:

https://datafusion.apache.org/user-guide/configs.html

datafusion.execution.parquet.schema_force_view_types

We are still working through some additional needed support:

@TheBuilderJR
Copy link
Contributor Author

@alamb yep that fixes it

        let config = SessionConfig::new()
            .set_bool("datafusion.execution.parquet.schema_force_view_types", false);

        // Create a SessionState using the config and runtime_env
        let state = SessionStateBuilder::new()
            .with_config(config)
            .with_default_features()
            .build();

        // Create a SessionContext
        let ctx = SessionContext::from(state);

I think one small frustration I've had with datafusion is the amount of backwards breaking changes. Is it fair to say that datafusion isn't ready for production yet? Are there any active plans to add a more comprehensive test suite so users can feel confident more confident with the updates? Or perhaps are there any config settings that I can opt into that trades off stability for performance?

@Omega359
Copy link
Contributor

Omega359 commented Dec 4, 2024

@alamb yep that fixes it

        let config = SessionConfig::new()
            .set_bool("datafusion.execution.parquet.schema_force_view_types", false);

        // Create a SessionState using the config and runtime_env
        let state = SessionStateBuilder::new()
            .with_config(config)
            .with_default_features()
            .build();

        // Create a SessionContext
        let ctx = SessionContext::from(state);

My guess is that you are being impacted by the issue being worked on in #13404 - Edit: no, in fact I am unsure as to whether I've seen this issue before - string -> timestamp implicit casting failing.

I think one small frustration I've had with datafusion is the amount of backwards breaking changes. Is it fair to say that datafusion isn't ready for production yet? Are there any active plans to add a more comprehensive test suite so users can feel confident more confident with the updates? Or perhaps are there any config settings that I can opt into that trades off stability for performance?

As far as tests there is this: #13470 however I don't believe that will cover a large portion of what is breaking between releases for people, at least not initially.

As far as being production ready I suppose that depends on your definition of production ready. It's being used in production by quite a few companies for some time now but upgrades do seem to have been a concern - #13525

@Omega359
Copy link
Contributor

Omega359 commented Dec 4, 2024

So, I just ran the following against main as of a few days ago:

:/apache_datafusion# docker run --rm -it datafusion-cli
DataFusion CLI v43.0.0
> create table logs(timestamp varchar);
0 row(s) fetched. 
Elapsed 0.003 seconds.

> insert into logs values ('2023-01-01T09:00:02');
+-------+
| count |
+-------+
| 1     |
+-------+
1 row(s) fetched. 
Elapsed 0.001 seconds.

> SELECT 
                EXTRACT(YEAR FROM timestamp) as year,
                EXTRACT(MONTH FROM timestamp) as month,
                COUNT(*) as count
             FROM logs
             WHERE timestamp IS NOT NULL 
             GROUP BY 
                EXTRACT(YEAR FROM timestamp),
                EXTRACT(MONTH FROM timestamp)
             ORDER BY year, month;
+------+-------+-------+
| year | month | count |
+------+-------+-------+
| 2023 | 1     | 1     |
+------+-------+-------+
1 row(s) fetched. 
Elapsed 0.012 seconds.

@findepi
Copy link
Member

findepi commented Dec 4, 2024

@TheBuilderJR can you please check whether the reproducer query that did work in DF 40 doesn't work in current main?
It might be the bug has been fixed and we can close the issue.

@TheBuilderJR
Copy link
Contributor Author

@findepi It was broken on a dec2 fork I maintain (https://github.com/TheBuilderJR/datafusion/tree/main-dec-2-fork). Is there a PR that you have that landed past that date?

@findepi
Copy link
Member

findepi commented Dec 4, 2024

I don't have any particular PR in mind, but if you have reproducer that still doesn't work on main, then it is something we can help with.

@TheBuilderJR
Copy link
Contributor Author

@findepi if you can't repro on main anymore but you can from before dec2 then I'm happy to close it out. I'm not sure if the repro above is exactly the same since I'm not sure if varchar gets converted into utf8 type. I'm unblocked now thanks to @alamb's suggestion so it's nbd either way. If you insist on closing this out I'm ok with that as well :)

@alamb
Copy link
Contributor

alamb commented Dec 17, 2024

Closing this as it appears to have been fixed on main. Please reopen if there is something else you find wrong.

@alamb alamb closed this as completed Dec 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants