sqlserver_column.py: Handle string dtype of nvarchar #606
Neither the base adapter nor the fabric adapter treats nvarchar as a string dtype. This causes problems when working with nvarchar columns, for example in the union macro: https://github.com/dbt-msft/dbt-sqlserver-utils/blob/master/macros/sql/union.sql
This commit adds nvarchar to the list of string dtypes, returns the correct string_size for nvarchar (the reported char_size is twice the character length, so it is halved), and uses the column's own dtype in string_type.
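A minimal sketch of the three changes described above, not the actual diff. `SketchColumn`, `STRING_DTYPES`, and the 256 fallback are hypothetical stand-ins chosen for illustration; the real class lives in sqlserver_column.py and inherits from the adapter's column class.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical list of string dtypes; the point is that nvarchar is included.
STRING_DTYPES = ("char", "varchar", "nvarchar")


@dataclass
class SketchColumn:
    """Stand-in for the adapter's column class (illustration only)."""

    column: str
    dtype: str
    char_size: Optional[int] = None

    def is_string(self) -> bool:
        return self.dtype.lower() in STRING_DTYPES

    def string_size(self) -> int:
        if not self.is_string():
            raise ValueError("string_size() called on a non-string column")
        if self.char_size is None:
            return 256  # assumed fallback when no size is known
        if self.dtype.lower() == "nvarchar":
            # char_size was observed to be twice the character length.
            return self.char_size // 2
        return self.char_size

    def string_type(self, size: int) -> str:
        # Instance method, so the column's own dtype can be reused.
        dtype = self.dtype if self.is_string() else "varchar"
        return f"{dtype}({size})"


col = SketchColumn(column="name", dtype="nvarchar", char_size=100)
print(col.string_type(col.string_size()))  # nvarchar(50)
```

With this shape, a column declared as nvarchar(50) keeps both its dtype and its length when a macro rebuilds the type string.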
I believe this mishandling of nvarchar is the root cause of #446.
I ran into this issue when unioning multiple relations that used nvarchar. The generated code specified no string length for the nvarchar fields, so the default string length of 30 was used.
I'm not sure that my handling of string_type is the best approach: it is a class method in the base class, but I think we want it to be instance-specific, since that is how the dtypes are handled.
Additionally, I'm not sure why char_size is twice the string length for nvarchar. I noticed it in my testing, which is why I halve it in string_size(). I understand that nvarchar uses twice as many bytes as varchar, so I suspect the two are related, but I don't know whether we would want to fix char_size so that it holds the actual number of characters, or just correct for it in string_size().
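For comparison, the other option (fixing char_size itself) could be sketched as a normalization step applied when the column is built. This assumes the size is reported in bytes, as `sys.columns.max_length` does in SQL Server, with -1 marking a `max` column; `char_length` and `TWO_BYTE_DTYPES` are hypothetical names, not part of the adapter.

```python
from typing import Optional

# Assumption: these dtypes store two bytes per character.
TWO_BYTE_DTYPES = ("nchar", "nvarchar")


def char_length(dtype: str, reported_size: Optional[int]) -> Optional[int]:
    """Convert a byte-based size into a character count (illustration only)."""
    if reported_size is None or reported_size < 0:
        # Unknown size or a max-length marker: pass through unchanged.
        return reported_size
    if dtype.lower() in TWO_BYTE_DTYPES:
        return reported_size // 2
    return reported_size
```

Normalizing once at this point would let string_size() stay unchanged, at the cost of altering what char_size means for any existing caller that relies on the byte-based value.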