RandomForest to PMML #47

Closed
Clls1 opened this issue Oct 8, 2018 · 4 comments

@Clls1

Clls1 commented Oct 8, 2018

Hello,

I have trained a randomForest model, but I am not able to export it to PMML. What could be the issue?
Thank you so much

rf <- randomForest(as.factor(fraude) ~ Total_Monto_A + Total_Saldo_A + as.factor(Tem_Modelo_Equipo) + Total_FaturasZero_R,
                   data = segmentacao_out_model, ntree = 5,
                   nodesize = 5, importance = TRUE)

Error:

> r2pmml(rf, "rf.pmml")
out 08, 2018 10:40:12 AM org.jpmml.rexp.Main run
INFO: Parsing RDS..
out 08, 2018 10:40:12 AM org.jpmml.rexp.Main run
INFO: Parsed RDS in 36 ms.
out 08, 2018 10:40:12 AM org.jpmml.rexp.Main run
INFO: Initializing default Converter
out 08, 2018 10:40:12 AM org.jpmml.rexp.Main run
INFO: Initialized org.jpmml.rexp.RandomForestConverter
out 08, 2018 10:40:12 AM org.jpmml.rexp.Main run
INFO: Converting..
out 08, 2018 10:40:12 AM org.jpmml.rexp.Main run
SEVERE: Failed to convert
java.lang.IllegalArgumentException: other
	at org.jpmml.rexp.RExpUtil.getDataType(RExpUtil.java:46)
	at org.jpmml.rexp.FormulaUtil.createFormula(FormulaUtil.java:71)
	at org.jpmml.rexp.RandomForestConverter.encodeFormula(RandomForestConverter.java:121)
	at org.jpmml.rexp.RandomForestConverter.encodeSchema(RandomForestConverter.java:70)
	at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:69)
	at org.jpmml.rexp.Converter.encodePMML(Converter.java:39)
	at org.jpmml.rexp.Main.run(Main.java:149)
	at org.jpmml.rexp.Main.main(Main.java:97)

Exception in thread "main" java.lang.IllegalArgumentException: other
	at org.jpmml.rexp.RExpUtil.getDataType(RExpUtil.java:46)
	at org.jpmml.rexp.FormulaUtil.createFormula(FormulaUtil.java:71)
	at org.jpmml.rexp.RandomForestConverter.encodeFormula(RandomForestConverter.java:121)
	at org.jpmml.rexp.RandomForestConverter.encodeSchema(RandomForestConverter.java:70)
	at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:69)
	at org.jpmml.rexp.Converter.encodePMML(Converter.java:39)
	at org.jpmml.rexp.Main.run(Main.java:149)
	at org.jpmml.rexp.Main.main(Main.java:97)
Error in .convert(tempfile, file, converter, converter_classpath, verbose) : 
  1

@vruusmann
Member

java.lang.IllegalArgumentException: other
at org.jpmml.rexp.RExpUtil.getDataType(RExpUtil.java:46)

It means that the JPMML-R library is unable to map the data type of one or more columns to PMML: it does not know the PMML equivalent of the R data type that it reports as "other".

I believe that it's related to the fact that you're performing "cast to factor" operations inside the R formula:

rf <- randomForest(as.factor(fraude) ~ Total_Saldo_A + as.factor(Tem_Modelo_Equipo), data = segmentacao_out_model)

Does the conversion succeed if you perform those cast operations before the randomForest() function call? For example:

segmentacao_out_model$fraude <- as.factor(segmentacao_out_model$fraude)
segmentacao_out_model$Tem_Modelo_Equipo <- as.factor(segmentacao_out_model$Tem_Modelo_Equipo)

rf <- randomForest(fraude ~ Total_Saldo_A + Tem_Modelo_Equipo, data = segmentacao_out_model)
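
Before training, it may also help to list the class of every column, since the converter has to map each one to a PMML data type. The following is only a minimal sketch: the data frame `df` and its values are hypothetical stand-ins for segmentacao_out_model, and only base R functions are used.

```r
# Hypothetical stand-in for segmentacao_out_model (values are made up).
df <- data.frame(
  fraude = c(0, 1, 0, 1),
  Total_Saldo_A = c(10.5, 3.2, 7.7, 1.1),
  Tem_Modelo_Equipo = c("a", "b", "a", "b"),
  stringsAsFactors = FALSE
)

# Cast to factor up front, outside of the model formula.
df$fraude <- as.factor(df$fraude)
df$Tem_Modelo_Equipo <- as.factor(df$Tem_Modelo_Equipo)

# First class of each column; anything other than the usual
# numeric / integer / factor / logical / character deserves a closer look.
col_classes <- sapply(df, function(col) class(col)[1])
print(col_classes)
```

If a column shows up with an unexpected class here, that column is a likely candidate for the data type error above.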

@Clls1
Author

Clls1 commented Oct 8, 2018

Thanks to your tip I discovered the problem. The thing was that one of the variables was of the type difftime; I converted it to numeric and it worked! Thank you so much! Please continue the great work!
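
For readers who hit the same error, here is a minimal sketch of that kind of fix. The thread does not show which column was affected or how it was converted, so the data frame and the column names (`start`, `end`, `elapsed`) are hypothetical, and the choice of days as the unit is an assumption.

```r
# Hypothetical data; the real column from the thread is not shown.
df <- data.frame(
  start = as.Date(c("2018-01-01", "2018-02-01")),
  end   = as.Date(c("2018-01-11", "2018-02-03"))
)

# Subtracting two Date columns yields a "difftime" column.
df$elapsed <- df$end - df$start

# Locate every difftime column in the data frame.
difftime_cols <- names(df)[sapply(df, inherits, what = "difftime")]

# Convert each one to plain numeric, stating the unit explicitly so the
# result does not depend on the unit the difftime happens to carry.
for (col in difftime_cols) {
  df[[col]] <- as.numeric(df[[col]], units = "days")
}

print(sapply(df, function(col) class(col)[1]))
```

Passing `units` explicitly is a deliberate choice: a difftime can be stored in seconds, minutes, hours, days, or weeks, and a bare `as.numeric()` returns the value in whatever unit is currently attached.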

Clls1 closed this as completed Oct 8, 2018
@vruusmann
Member

"The thing was that one of the variables was of the type difftime"

Can you provide a reproducible example of using difftime?

The PMML standard provides first-class date/time data types and can do arithmetic with them (e.g. calculating the number of days between two dates, or the number of seconds between two timestamps). I would be very interested in prototyping something in this area.

@vruusmann
Member

Related issues:
jpmml/jpmml-r#8
jpmml/jpmml-r#9
