Warn about large un-transpileable model elements. #9

Open
rhorrell opened this issue May 11, 2020 · 5 comments

Comments

@rhorrell

Hi Villu,
I saw you still had some work to do to address the smaller model I sent you. Thank you.

I tried to transpile the huge PMML file (178 MB), and I am getting the following error:

$ java  -Xms8g -Xmx8g -jar jpmml-transpiler-executable-1.1-SNAPSHOT.jar --xml-input ~/data/pmml/xgb_pmml.xml  --jar-output ~/data/pmml/x.jar
/PMML$1583159071.java:5473: error: code too large
    private final static MiningModel buildMiningModel$20049680() {
                                     ^
1 error
java.io.IOException
        at org.jpmml.codemodel.CompilerUtil.compile(CompilerUtil.java:81)
        at org.jpmml.codemodel.CompilerUtil.compile(CompilerUtil.java:56)
        at org.jpmml.codemodel.CompilerUtil.compile(CompilerUtil.java:49)
        at org.jpmml.transpiler.TranspilerUtil.compile(TranspilerUtil.java:75)
        at org.jpmml.transpiler.Main.run(Main.java:115)
        at org.jpmml.transpiler.Main.main(Main.java:98)

Any thoughts?

@vruusmann
Member

Now this is a proper "code too large" compiler error!

I can see that you're attempting to transpile an XGBoost model, and the error happens in relation to a MiningModel element. There are two of those - the top-level one (implementing a modelChain functionality) and inner one(s) (implementing a sum functionality).

Do you have any idea which MiningModel element is causing this error? My guess is it's the inner one(s).

The size of a PMML class model object is easy to measure. The workaround is to split one big method into several smaller methods. A good threshold metric is the position of a decision tree model (e.g. start a new method after every 100 decision trees).
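For context, the JVM limits the bytecode of a single method to 64 KB, which is what javac reports as "code too large". The splitting idea can be sketched as below. This is only an illustration of the chunking step, not JPMML-Transpiler code: the class name `MethodChunker`, the method `partition` and the chunk size of 100 are hypothetical, with the chunk size taken from the example threshold mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

public class MethodChunker {

    /**
     * Splits treeCount decision trees into half-open index ranges [start, end),
     * one range per generated helper method (hypothetical scheme).
     */
    static List<int[]> partition(int treeCount, int chunkSize) {
        if (treeCount < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("treeCount must be >= 0 and chunkSize must be > 0");
        }
        List<int[]> ranges = new ArrayList<>();
        for (int start = 0; start < treeCount; start += chunkSize) {
            // The last range may be shorter than chunkSize
            ranges.add(new int[] {start, Math.min(start + chunkSize, treeCount)});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 250 trees, a new helper method after every 100 trees
        for (int[] range : partition(250, 100)) {
            System.out.println("helper method for trees [" + range[0] + ", " + range[1] + ")");
        }
    }
}
```

Each range would then map to one smaller builder method, and the top-level method would only call the helpers in order, so that no single method body grows with the total number of trees.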

@vruusmann
Member

The method name buildMiningModel indicates that JPMML-Transpiler failed to transpile a MiningModel element into a JavaModel pseudo-element, and is now attempting to generate application code for creating an org.dmg.pmml.mining.MiningModel class model object programmatically.

Something like this:

static
public MiningModel buildMiningModel(){
  return new MiningModel().setMiningSchema(new MiningSchema()).setSegmentation(new Segmentation());
} 

To solve this "code too large" error you must restructure your XGBoost PMML file so that it would become transpile-able.

The important thing to note is that XGBoost models can be represented in PMML in two ways - the default representation (missing value handling uses the Node@defaultChild attribute) and the compact representation.

The JPMML-Transpiler library can only transpile the compact representation right now.

I believe that your XGBoost model is in the default representation. If you re-export or re-code it into the compact representation, then the transpilation should succeed without any "code too large" errors.
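One way to check which representation a PMML file uses is to count the Node elements that carry a defaultChild attribute. The sketch below does this with the JDK's streaming XML parser (StAX), which avoids loading a 178 MB document into memory. The class name `DefaultChildScanner` and the inline sample document are made up for illustration, and treating "any Node with defaultChild" as a sign of the default representation is an assumption based on the description above.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class DefaultChildScanner {

    /** Returns {number of Node elements, number of Node elements with a defaultChild attribute}. */
    static int[] countNodes(Reader pmml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Do not process DTDs or external entities
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        XMLStreamReader reader = factory.createXMLStreamReader(pmml);
        int nodes = 0;
        int withDefaultChild = 0;
        try {
            while (reader.hasNext()) {
                // Match on the local name so that the PMML namespace does not matter
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "Node".equals(reader.getLocalName())) {
                    nodes++;
                    if (reader.getAttributeValue(null, "defaultChild") != null) {
                        withDefaultChild++;
                    }
                }
            }
        } finally {
            reader.close();
        }
        return new int[] {nodes, withDefaultChild};
    }

    public static void main(String[] args) throws Exception {
        // Tiny made-up fragment; a real run would read the PMML file instead
        String sample = "<TreeModel><Node id=\"1\" defaultChild=\"2\">"
            + "<Node id=\"2\"/><Node id=\"3\"/></Node></TreeModel>";
        int[] counts = countNodes(new StringReader(sample));
        System.out.println(counts[1] + " of " + counts[0] + " Node elements use defaultChild");
    }
}
```

For a real file, pass a reader or input stream opened on the PMML document. A non-zero second count would suggest the default representation.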

@vruusmann vruusmann changed the title error: code too large Warn about large un-transpileable model elements. May 11, 2020
@vruusmann
Member

Example XGBoost model in the compact representation: https://github.com/jpmml/jpmml-transpiler/blob/1.1.0/src/test/resources/pmml/XGBoostAuditNA.pmml

@rhorrell
Author

Hi Villu,
I was able to transpile a smaller XGB model, but I am getting the following error when evaluating it:

$ java -jar pmml-evaluator-example-executable-1.5-SNAPSHOT.jar --model ~/data/pmml/xgb_pmml.jar --input ~/data/input/xgb_testdata.csv --output  /dev/null >> log2.log
Picked up JAVA_TOOL_OPTIONS:  -Xms8g -Xmx8g
Exception in thread "main" java.lang.IllegalArgumentException
        at org.jpmml.evaluator.regression.RegressionModelUtil.normalizeBinaryLogisticClassificationResult(RegressionModelUtil.java:198)
        at org.jpmml.evaluator.regression.RegressionModelUtil.computeBinomialProbabilities(RegressionModelUtil.java:46)
        at PMML$1583159071$JavaModel$695248316.evaluateRegressionTableList$768194342(PMML$1583159071.java)
        at PMML$1583159071$JavaModel$695248316.evaluateClassification(PMML$1583159071.java)
        at org.jpmml.evaluator.java.JavaModelEvaluator.evaluateClassification(JavaModelEvaluator.java:56)
        at org.jpmml.evaluator.ModelEvaluator.evaluateInternal(ModelEvaluator.java:468)
        at org.jpmml.evaluator.mining.MiningModelEvaluator.evaluateSegmentation(MiningModelEvaluator.java:540)
        at org.jpmml.evaluator.mining.MiningModelEvaluator.evaluateClassification(MiningModelEvaluator.java:306)
        at org.jpmml.evaluator.ModelEvaluator.evaluateInternal(ModelEvaluator.java:468)
        at org.jpmml.evaluator.mining.MiningModelEvaluator.evaluateInternal(MiningModelEvaluator.java:239)
        at org.jpmml.evaluator.ModelEvaluator.evaluate(ModelEvaluator.java:297)
        at org.jpmml.evaluator.example.EvaluationExample.execute(EvaluationExample.java:418)
        at org.jpmml.evaluator.example.Example.execute(Example.java:92)
        at org.jpmml.evaluator.example.EvaluationExample.main(EvaluationExample.java:262)

Is this the other issue you were working on?

@vruusmann
Member

@rhorrell This exception is definitely separate from the current "code too large" issue. Please open a new issue for each unique exception!

Exception in thread "main" java.lang.IllegalArgumentException
org.jpmml.evaluator.regression.RegressionModelUtil.normalizeBinaryLogisticClassificationResult(RegressionModelUtil.java:198)

The transpiled RegressionModel element specifies a link function that does not seem to be permitted according to the PMML 4.4 standard:
https://github.com/jpmml/jpmml-evaluator/blob/1.5.1/pmml-evaluator/src/main/java/org/jpmml/evaluator/regression/RegressionModelUtil.java#L184-L198

You should be getting the same exception when evaluating this XGBoost model with the JPMML-Evaluator library in the normal "interpreted" mode.

Most likely, your XGBoost model is invalid. What is/was your XGBoost-to-PMML conversion tool?
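To see what the PMML file actually declares, it may help to list the normalizationMethod attribute of every RegressionModel element. The sketch below assumes that this attribute is what "link function" refers to here; the class name `NormalizationMethodLister`, the "(not set)" placeholder and the sample fragment are made up for illustration.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class NormalizationMethodLister {

    /** Collects the distinct normalizationMethod values found on RegressionModel elements. */
    static Set<String> listMethods(Reader pmml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Do not process DTDs or external entities
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        XMLStreamReader reader = factory.createXMLStreamReader(pmml);
        Set<String> methods = new TreeSet<>();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "RegressionModel".equals(reader.getLocalName())) {
                    String value = reader.getAttributeValue(null, "normalizationMethod");
                    // Placeholder for elements that do not declare the attribute
                    methods.add(value != null ? value : "(not set)");
                }
            }
        } finally {
            reader.close();
        }
        return methods;
    }

    public static void main(String[] args) throws Exception {
        // Tiny made-up fragment; a real run would read the PMML file instead
        String sample = "<MiningModel><RegressionModel normalizationMethod=\"logit\"/>"
            + "<RegressionModel/></MiningModel>";
        System.out.println(listMethods(new StringReader(sample)));
    }
}
```

The reported values can then be compared against the cases handled in the RegressionModelUtil source linked above.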
