OrdinalDomain does not take real ordinal values in interval of length less than 2 #82

Open
kmundnic opened this issue Nov 23, 2017 · 1 comment

kmundnic commented Nov 23, 2017

Hi,
OrdinalDomain only takes Ints as min and max values, which does not work when the ordinal values are Floats. For example, ordinal values [0.0, 0.1, 0.2, 0.3] trigger an inaccurate warning (domains.jl:40) and a subsequent error.

I'm thinking of solving this by defining:

# Ordinal data takes real values ranging from `min` to `max`
immutable OrdinalDomain<:Domain
	min::Real
	max::Real
	function OrdinalDomain(elements)
		# Warn when there are at most two distinct levels
		if length(unique(elements)) <= 2
			warn("The ordinal variable you've created is degenerate: it has at most two levels. Consider using a Boolean variable instead; ordinal loss functions may have unexpected behavior on a degenerate ordinal domain.")
		end
		return new(minimum(elements), maximum(elements))
	end
end

Any comments/concerns on this fix?

@madeleineudell
Owner

Right now, the min and max values are used to index into a vector; they have to be Ints! The simple fix is to transform your data, mapping your ordinal values to consecutive integers.
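
A minimal sketch of that data transformation, assuming the ordinal column is a plain vector of values. The helper name `ordinal_codes` is hypothetical and is not part of the package:

```julia
# Hypothetical helper (not part of the package): map arbitrary ordered
# values to the consecutive integers 1, 2, ..., k.
function ordinal_codes(values)
    levels = sort(unique(values))                           # distinct levels, ascending
    code_of = Dict(l => i for (i, l) in enumerate(levels))  # level => integer code
    codes = [code_of[v] for v in values]                    # one integer code per observation
    return codes, levels
end

codes, levels = ordinal_codes([0.0, 0.1, 0.2, 0.3, 0.1])
# The codes now range over the integers from 1 to length(levels).
```

Keeping `levels` around lets you translate results back later. Note that float levels produced by arithmetic (rather than read from data) may differ in the last bits, so two values that look the same can end up as separate levels.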

The more complex fix would be to do this internally inside the GLRM. You'd probably have to add a new field holding the original data, and define functions mapping each column from the original domain to the transformed domain and back. Let glrm.A be the transformed data and glrm.df be the original data. Methods like impute, impute_missing, sample, and sample_missing should first impute glrm.A, then transform the data back to the original coordinates using the defined map, and return glrm.df.
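
The forward and inverse maps for a single column could look roughly like the sketch below. It works on plain vectors only; `encode`, `decode`, and the stand-in `imputed_codes` are hypothetical names for illustration and do not reflect how the package is actually wired:

```julia
# Hypothetical helpers; the names are illustrative and not part of the package.

# Forward map: original value -> integer code (position among the sorted levels).
function encode(levels, v)
    i = searchsortedfirst(levels, v)
    (i <= length(levels) && levels[i] == v) || error("value $v is not a known level")
    return i
end

# Inverse map: integer code -> original value.
decode(levels, i) = levels[i]

column = [0.0, 0.1, 0.2, 0.3, 0.1]             # one column in the original coordinates
levels = sort(unique(column))                  # stored alongside the model
A_col  = [encode(levels, v) for v in column]   # integer column the model would index with

imputed_codes = [1, 2, 3, 4, 2]                # stand-in for integer codes from imputation
imputed_vals  = [decode(levels, i) for i in imputed_codes]   # back in original coordinates
```

Storing only the sorted levels per column is enough to invert the map, so the original data would not strictly need to be duplicated for this step.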
