Failed to convert single character to Enum using parser #998
Comments
Ah it's because we added Char recognition to our parser system.
It should work if you write I'll have a look at whether we can pass Chars (as String) through the parser function as well; that way we can keep the previous notation working. Also, I'll add some tests to catch it in the future. Thanks!
When using .convertTo<DataSchemaType>{
parser { ACHType.fromSymbol(it) }
parser { other type parsing from it}
} Here, |
In the original post, I think the problem is the It did not work as other parsers. |
@hantsy yes that's true. Could you share I suspect Reading CSV can now give We'd have to change |
I created an example project to reproduce the issue: https://github.com/hantsy/dataframe-sandbox/blob/master/src/main/kotlin/Main.kt
When running MainKt, it prints the following output:
12:58:11.221 [pool-1-thread-4] DEBUG org.jetbrains.kotlinx.dataframe.impl.io.FastDoubleParser -- Could not parse 'N' as Double with NumberFormat with locale 'en_US'.
Amount: Double
Last 4: Int
ACH Type: Char
Exception in thread "main" org.jetbrains.kotlinx.dataframe.exceptions.TypeConverterNotFoundException: Type converter from kotlin.Char to org.example.ACHType? is not found for column 'ACH Type'
at org.jetbrains.kotlinx.dataframe.impl.api.ConvertKt.convertToTypeImpl$convertPerCell(convert.kt:189)
at org.jetbrains.kotlinx.dataframe.impl.api.ConvertKt.convertToTypeImpl(convert.kt:209)
at org.jetbrains.kotlinx.dataframe.api.ConvertKt.convertTo(convert.kt:137)
at org.jetbrains.kotlinx.dataframe.impl.api.ConvertToKt.convertToImpl$convertToSchema(convertTo.kt:175)
at org.jetbrains.kotlinx.dataframe.impl.api.ConvertToKt.convertToImpl(convertTo.kt:277)
at org.example.MainKt.main(Main.kt:51)
at org.example.MainKt.main(Main.kt)
Process finished with exit code 1
It seems this is a free-style conversion for developers.
What do you mean with "free-style" conversion?
Ah yes, as I expected. Just to check we both understand the situation: Your CSV is parsed as Then you want to convert it to As you can see in the docs, you can provide your strategies to the DSL of how it can convert type In the DSL you write: .convertTo<CsvExampleDataModel> {
// freely convert
parser { ACHType.fromSymbol(it) }
parser { BigDecimal(it) }
} which tells dataframe how to convert Were you to add the strategy So far the current situation. To make this situation easier in the future, I suggest either of the following changes:
Which would you think is the best solution? (On a slightly related note, if you want your CSV to not parse String -> Char, but instead keep them as Strings, as in older versions, you can now supply |
It means I cannot convert the original string value to my own type as expected, bypassing the DataFrame built-in conversion.
It breaks the existing rules, currently I found when using |
I am not sure which one is better. But I think the original (of course, I do not know if it is the design purpose. If it is not, as a developer, I think it is better to leave such room for developers to handle conversion manually instead of applying built-in converters) |
readCsv
val df = DataFrame.readCsv(file,
colTypes = mapOf(
"Amount" to ColType.Double, // per column name
ColType.DEFAULT to ColType.String, // or for all columns at once
),
)
After calling this example, we'll get a DataFrame with
convert
You can, however, perform a
val newDf = df.convert { "ACH Type"<String>() }.with { ACHType.fromSymbol(it) }
This results in a new dataframe with the columns
convertTo
Another operation is
Calling
val result1 = newDf.convertTo<CsvExampleDataModel>()
tells DataFrame to do the following conversions:
which it can do just fine, because we support automatic type conversion between any number types. However, calling this on the previous dataframe:
df.convertTo<CsvExampleDataModel>()
will fail, because it tells DataFrame to do:
You can fix this by telling DataFrame how to do this type of conversion like:
df.convertTo<CsvExampleDataModel> {
convert<String>().with { ACHType.fromSymbol(it) }
}
this tells DataFrame to do:
This succeeds :)
An alternate notation which does exactly the same is:
df.convertTo<CsvExampleDataModel> {
parser { ACHType.fromSymbol(it) }
}
Hopefully this explanation helps to understand how these operations work and how they can be used. All examples I just wrote above can be executed in this notebook if you want to get a better feeling for them. |
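To make the failure mode concrete, here is a toy model in plain Kotlin of converters being looked up by source and target type. It is only a sketch: `ConverterRegistry`, its methods, and the `ACHType` constants are hypothetical and are not DataFrame's actual implementation. It shows why a converter registered for String does not fire for a cell that was read as Char.

```kotlin
import kotlin.reflect.KClass

// Hypothetical enum; the constants are illustrative, not taken from the sandbox project.
enum class ACHType { NEW, RECURRING }

// Toy registry keyed by (source type, target type). Not DataFrame's real code.
class ConverterRegistry {
    private val converters = mutableMapOf<Pair<KClass<*>, KClass<*>>, (Any) -> Any?>()

    fun <S : Any, T : Any> register(source: KClass<S>, target: KClass<T>, convert: (S) -> T?) {
        @Suppress("UNCHECKED_CAST")
        converters[source to target] = convert as (Any) -> Any?
    }

    fun <T : Any> convert(value: Any, target: KClass<T>): T? {
        // The lookup uses the runtime type of the cell, so a Char cell never
        // matches a converter that was registered for String.
        val converter = converters[value::class to target]
            ?: throw IllegalStateException(
                "Type converter from ${value::class.simpleName} to ${target.simpleName} is not found"
            )
        @Suppress("UNCHECKED_CAST")
        return converter(value) as T?
    }
}

fun main() {
    val registry = ConverterRegistry()

    // Comparable to `parser { ... }`: a String -> ACHType strategy.
    registry.register(String::class, ACHType::class) { s ->
        ACHType.values().firstOrNull { it.name.startsWith(s) }
    }

    println(registry.convert("N", ACHType::class))                        // NEW
    println(runCatching { registry.convert('N', ACHType::class) }.isFailure) // true

    // Comparable to `convert<Char>().with { ... }`: a Char -> ACHType strategy.
    registry.register(Char::class, ACHType::class) { c ->
        ACHType.values().firstOrNull { it.name.startsWith(c) }
    }
    println(registry.convert('N', ACHType::class))                        // NEW
}
```

In this model the String strategy keeps working for String cells, and the Char cell only converts once a Char strategy is registered, which mirrors the behaviour described in this thread.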
Thanks for your explanation. In my example, when reading the CSV, I did not add The
I want to convert to the expected types defined in
I call .convertTo<CsvExampleDataModel> {
// freely convert
parser { ACHType.fromSymbol(it) } // `it` always refers to a String type
parser { BigDecimal(it) }
} The The Why |
Anyway, using |
In your example, you can remove the line In this example, So, again, to fix the example, you should write: DataFrame.readCsv(...)
.convertTo<CsvExampleDataModel> {
// supplying convertTo the function (Char) -> ACHType it can use when needed
convert<Char>().with { ACHType.fromSymbol(it.toString()) }
}
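On the application side, one way to be robust against the column arriving as either String or Char is to give the enum a lookup for both. The following is a minimal sketch, assuming a hypothetical definition of `ACHType` (the real one lives in the linked sandbox project and is not shown in this thread); the constants and symbols are made up for illustration.

```kotlin
// Hypothetical enum; constant names and symbols are assumptions for illustration.
enum class ACHType(val symbol: Char) {
    NEW('N'),
    RECURRING('R');

    companion object {
        // Lookup from a Char, e.g. a cell that CSV reading inferred as Char.
        fun fromSymbol(value: Char): ACHType? =
            values().firstOrNull { it.symbol == value }

        // Lookup from a String, e.g. what a String-based parser receives.
        // Anything that is not exactly one character yields null.
        fun fromSymbol(value: String): ACHType? =
            value.singleOrNull()?.let { fromSymbol(it) }
    }
}

fun main() {
    println(ACHType.fromSymbol("N"))  // NEW
    println(ACHType.fromSymbol('R'))  // RECURRING
    println(ACHType.fromSymbol("NR")) // null
}
```

With overloads like these, the same `fromSymbol` call can be used from a String-based strategy and from a Char-based one without calling `toString()` at the call site.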
Got it, thanks. |
After upgrading to v0.15, a new issue occurred in our application.
There is a column in the CSV that holds a single character, and we use the following parser to convert it to an Enum directly from the raw string. This worked well with the former version, but now fails with the following information.