
Please provide legacy support for FeatureDatum input Tuples #10

Open
Schmed opened this issue Nov 13, 2015 · 4 comments

@Schmed
Member

Schmed commented Nov 13, 2015

There's legacy code out there that expects to call the `classify()` method with `FeatureDatum` Tuples, but this method now supports only `TermsDatum` Tuples. Why not have a legacy `getFeatures()` method that takes a term map, used by a legacy `classify()` method that takes a `TermsDatum`, etc.?

@kkrugler
Member

So this `getFeatures()` method would be part of what class?

@Schmed
Member Author

Schmed commented Nov 13, 2015

I'm referring to the following method within `com.scaleunlimited.classify.model.HashedFeaturesLibLinearModel.java`:

```java
private Feature[] getFeatures(Map<String, Integer> terms)
```

I'm suggesting that a second method with the old signature be written in the same class:

```java
private Feature[] getOldFeatures(Map<String, Double> terms)
```
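Sketched out, the two signatures might coexist like this. This is only an illustration: `FeatureNode` is a minimal stand-in for liblinear's `Feature` type, and `indexOf()` is a placeholder for the model's real hashed term-to-index mapping, neither taken from the actual codebase:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LegacyFeaturesSketch {

    // Minimal stand-in for liblinear's Feature/FeatureNode.
    static class FeatureNode {
        final int index;
        final double value;
        FeatureNode(int index, double value) { this.index = index; this.value = value; }
    }

    // Placeholder for the real class's hashed term -> feature index mapping.
    static int indexOf(String term) {
        return (term.hashCode() & Integer.MAX_VALUE) % 1000 + 1;
    }

    // Current signature: integer term counts.
    static FeatureNode[] getFeatures(Map<String, Integer> terms) {
        List<FeatureNode> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : terms.entrySet()) {
            result.add(new FeatureNode(indexOf(e.getKey()), e.getValue()));
        }
        return result.toArray(new FeatureNode[0]);
    }

    // Proposed legacy signature: double feature weights, as in the old API.
    static FeatureNode[] getOldFeatures(Map<String, Double> terms) {
        List<FeatureNode> result = new ArrayList<>();
        for (Map.Entry<String, Double> e : terms.entrySet()) {
            result.add(new FeatureNode(indexOf(e.getKey()), e.getValue()));
        }
        return result.toArray(new FeatureNode[0]);
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new TreeMap<>();
        weights.put("foo", 0.5);
        System.out.println(getOldFeatures(weights)[0].value);
    }
}
```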

@kkrugler
Member

`HashedFeaturesLibLinearModel` extends `BaseLibLinearModel`, which extends `BaseModel<TermsDatum>`. So this means the API for `addTrainingTerms(TermsDatum)` and `classify(TermsDatum)` has to stay as-is. I think you're proposing adding alternative versions of those two methods in `BaseLibLinearModel` that take a `FeatureDatum` argument.

The tricky part is that `BaseLibLinearModel` maintains the list of `TermsDatum` documents for training. So I'd either need to have a parallel array (and verify you were only using one or the other), or I could convert the `FeatureDatum` to a `TermsDatum` by, say, multiplying each feature weight by some large fixed constant. That would impose a constraint on the max weight you could provide, but that's probably OK. Thoughts?
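A minimal sketch of that weight-to-count conversion, assuming a hypothetical fixed `SCALE` constant (the class and constant names are illustrative, not from the actual codebase):

```java
import java.util.HashMap;
import java.util.Map;

public class WeightConversionSketch {

    // Fixed scale factor. Weights larger than Integer.MAX_VALUE / SCALE would
    // overflow the int count, which is the "constraint on the max weight"
    // mentioned in the comment above.
    private static final int SCALE = 1_000_000;

    // Convert double feature weights into the Map<String, Integer> term-count
    // shape that a TermsDatum carries.
    public static Map<String, Integer> toTermCounts(Map<String, Double> features) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, Double> e : features.entrySet()) {
            counts.put(e.getKey(), (int) Math.round(e.getValue() * SCALE));
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Double> features = new HashMap<>();
        features.put("word", 0.25);
        System.out.println(toTermCounts(features).get("word"));
    }
}
```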

@kkrugler
Member

Or I guess I could convert `TermsDatum` to `FeatureDatum` in `addTrainingTerms`, store that in the list, and bail on ever supporting normalization that works across all training documents. Which is probably OK, as I already pulled out the `TfIdfNormalizer`.
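That approach might look roughly like the following. Here `TermsDatum` and `FeatureDatum` are reduced to bare stand-in classes, and the per-document frequency normalization shown is just one possible choice; since conversion happens at add time, any normalization that needs the whole corpus (e.g. TF-IDF) is off the table, which is the trade-off described above:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TrainingConversionSketch {

    // Bare stand-ins for the real Tuple-backed datum types.
    static class TermsDatum { Map<String, Integer> counts = new HashMap<>(); }
    static class FeatureDatum { Map<String, Double> weights = new HashMap<>(); }

    // Single list of training documents, all stored as FeatureDatum.
    final List<FeatureDatum> trainingDocs = new ArrayList<>();

    // Convert term counts to per-document weights immediately, then store.
    void addTrainingTerms(TermsDatum datum) {
        int total = 0;
        for (int count : datum.counts.values()) {
            total += count;
        }
        FeatureDatum converted = new FeatureDatum();
        for (Map.Entry<String, Integer> e : datum.counts.entrySet()) {
            converted.weights.put(e.getKey(), e.getValue() / (double) total);
        }
        trainingDocs.add(converted);
    }

    // FeatureDatum documents go straight into the same list.
    void addTrainingFeatures(FeatureDatum datum) {
        trainingDocs.add(datum);
    }

    public static void main(String[] args) {
        TrainingConversionSketch model = new TrainingConversionSketch();
        TermsDatum d = new TermsDatum();
        d.counts.put("a", 3);
        d.counts.put("b", 1);
        model.addTrainingTerms(d);
        System.out.println(model.trainingDocs.get(0).weights.get("a"));
    }
}
```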
