fix: select first matching language name where there are conflicts, and fix null language description #254
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
UPDATE t_iso639_3 SET Part2B=NULL WHERE Part2B=''; | ||
UPDATE t_iso639_3 SET Part2T=NULL WHERE Part2T=''; | ||
UPDATE t_iso639_3 SET Part1=NULL WHERE Part1=''; | ||
UPDATE t_iso639_3 SET _Comment=NULL WHERE _Comment=''; | ||
UPDATE t_iso639_3 SET CanonicalId=COALESCE(CAST(Part1 AS NVARCHAR),CAST(Id AS NVARCHAR)); | ||
|
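The ordering in the script above matters: `COALESCE` only skips `NULL`, so empty strings have to be converted to `NULL` before `CanonicalId` is derived. The following is a minimal sketch of that behaviour, using Python's `sqlite3` module as a stand-in for the SQL Server database the PR targets. The cut-down table layout and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced version of t_iso639_3 (only the columns needed here).
conn.execute("CREATE TABLE t_iso639_3 (Id TEXT, Part1 TEXT, CanonicalId TEXT)")
conn.executemany(
    "INSERT INTO t_iso639_3 (Id, Part1) VALUES (?, ?)",
    [("eng", "en"), ("akl", "")],  # sample rows: the second has an empty Part1
)

# Step 1: turn empty strings into NULL. Without this, COALESCE would keep ''
# because an empty string is not NULL.
conn.execute("UPDATE t_iso639_3 SET Part1 = NULL WHERE Part1 = ''")

# Step 2: prefer the two-letter code, fall back to the three-letter Id.
conn.execute("UPDATE t_iso639_3 SET CanonicalId = COALESCE(Part1, Id)")

canonical = dict(conn.execute("SELECT Id, CanonicalId FROM t_iso639_3"))
print(canonical)
```

If step 1 were skipped, the second row would end up with an empty `CanonicalId` rather than falling back to its `Id`.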
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
-- | ||
-- We need to do some sanitisation of the t_language_index and t_iso639_3_names | ||
-- to remove names marked as pejorative in the Ethnologue index. | ||
-- | ||
|
||
delete | ||
t_iso639_3_names | ||
where exists (select * from t_ethnologue_language_index el where el.LangID = t_iso639_3_names.Id and (el.nametype='LP' or el.nametype='DP')) | ||
|
||
delete | ||
t_language_index | ||
where exists (select * from t_ethnologue_language_index el where el.LangID = t_language_index.language_id and (el.nametype='LP' or el.nametype='DP')) | ||
|
||
delete from t_ethnologue_language_index where nametype='LP' or nametype='DP'; | ||
|
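The cleanup above uses a correlated `EXISTS` subquery to decide which rows to delete. The sketch below reproduces the pattern for `t_language_index` in SQLite via Python's `sqlite3`, not in the T-SQL dialect of the PR. The `name` columns and all sample rows are hypothetical; only `LangID`, `language_id`, and `nametype` come from the script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_ethnologue_language_index (LangID TEXT, nametype TEXT, name TEXT);
CREATE TABLE t_language_index (language_id TEXT, name TEXT);
INSERT INTO t_ethnologue_language_index VALUES
  ('aaa', 'L',  'Name A'),
  ('bbb', 'L',  'Name B'),
  ('bbb', 'LP', 'Flagged name');
INSERT INTO t_language_index VALUES
  ('aaa', 'Name A'),
  ('bbb', 'Name B'),
  ('bbb', 'Flagged name');
""")

# Same predicate as the script: correlate on the language ID and test for
# the LP/DP name types.
conn.execute("""
DELETE FROM t_language_index
WHERE EXISTS (
  SELECT 1 FROM t_ethnologue_language_index el
  WHERE el.LangID = t_language_index.language_id
    AND el.nametype IN ('LP', 'DP')
)
""")
# Finally remove the flagged rows from the Ethnologue index itself.
conn.execute(
    "DELETE FROM t_ethnologue_language_index WHERE nametype IN ('LP', 'DP')"
)

remaining = conn.execute(
    "SELECT language_id, name FROM t_language_index ORDER BY name"
).fetchall()
flagged_left = conn.execute(
    "SELECT COUNT(*) FROM t_ethnologue_language_index "
    "WHERE nametype IN ('LP', 'DP')"
).fetchone()[0]
print(remaining, flagged_left)
```

Because the subquery correlates on the language ID alone, in this sketch both `bbb` rows are removed from `t_language_index`, not only the flagged name.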
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
-- | ||
-- Deprecated keyboards and models should be flagged as such in the t_keyboard/t_model data | ||
-- | ||
|
||
update t_keyboard | ||
set deprecated = 1 | ||
where exists (select * from t_keyboard_related kr where kr.related_keyboard_id = t_keyboard.keyboard_id and kr.deprecates = 1); | ||
|
||
update t_model | ||
set deprecated = 1 | ||
where exists (select * from t_model_related mr where mr.related_model_id = t_model.model_id and mr.deprecates = 1); | ||
|
||
-- | ||
-- Any keyboard that has been replaced by another one, or is not Unicode, is marked as obsolete | ||
-- | ||
|
||
update t_keyboard | ||
set obsolete = 1 | ||
where deprecated = 1 or is_unicode = 0 |
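The two-stage flagging above can be traced on a small example: a keyboard becomes deprecated when another keyboard lists it as a related keyboard with `deprecates = 1`, and it becomes obsolete when it is deprecated or not Unicode. This is a sketch in SQLite via Python's `sqlite3`, assuming a reduced schema; the keyboard IDs and the `keyboard_id` column on `t_keyboard_related` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_keyboard (
  keyboard_id TEXT,
  is_unicode  INTEGER,
  deprecated  INTEGER DEFAULT 0,
  obsolete    INTEGER DEFAULT 0
);
CREATE TABLE t_keyboard_related (
  keyboard_id         TEXT,     -- hypothetical: the keyboard declaring the relation
  related_keyboard_id TEXT,
  deprecates          INTEGER
);
INSERT INTO t_keyboard (keyboard_id, is_unicode) VALUES
  ('old_kb', 1), ('new_kb', 1), ('legacy_kb', 0);
INSERT INTO t_keyboard_related VALUES ('new_kb', 'old_kb', 1);
""")

# Stage 1: flag keyboards that some other keyboard deprecates.
conn.execute("""
UPDATE t_keyboard SET deprecated = 1
WHERE EXISTS (
  SELECT 1 FROM t_keyboard_related kr
  WHERE kr.related_keyboard_id = t_keyboard.keyboard_id
    AND kr.deprecates = 1
)
""")

# Stage 2: obsolete means replaced by another keyboard, or not Unicode.
conn.execute(
    "UPDATE t_keyboard SET obsolete = 1 WHERE deprecated = 1 OR is_unicode = 0"
)

flags = {
    kb: (deprecated, obsolete)
    for kb, deprecated, obsolete in conn.execute(
        "SELECT keyboard_id, deprecated, obsolete FROM t_keyboard"
    )
}
print(flags)
```

Stage 2 depends on stage 1 having run first, since it reads the `deprecated` flag that stage 1 sets.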
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
-- | ||
-- Canonicalize bcp47 codes into langtags entries | ||
-- | ||
-- First, fix up those that are missing from t_langtag | ||
-- | ||
|
||
-- Find those that are missing where there is a matching base tag but not a matching full tag | ||
|
||
INSERT | ||
t_langtag (tag, [full], iso639_3, region, regionname, name, sldr, script, windows) | ||
SELECT DISTINCT | ||
kl.bcp47, | ||
kl.bcp47, | ||
null, | ||
t.region, | ||
t.regionname, | ||
kl.description, | ||
0, | ||
kl.script_id, | ||
kl.bcp47 | ||
FROM | ||
t_keyboard_language kl LEFT JOIN | ||
t_langtag_tag tt ON kl.bcp47 = tt.tag LEFT JOIN | ||
t_langtag_tag tt0 ON kl.language_id = tt0.tag LEFT JOIN | ||
t_langtag t ON tt0.base_tag = t.tag | ||
WHERE | ||
tt.tag IS NULL AND | ||
tt0.tag IS NOT NULL | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
-- Insert the tags above for searching against | ||
|
||
INSERT | ||
t_langtag_tag (base_tag, tag, tagtype) | ||
SELECT DISTINCT | ||
kl.bcp47, | ||
kl.bcp47, | ||
5 -- custom (keyboard) tag type | ||
FROM | ||
t_keyboard_language kl LEFT JOIN | ||
t_langtag_tag tt ON kl.bcp47 = tt.tag LEFT JOIN | ||
t_langtag_tag tt0 ON kl.language_id = tt0.tag | ||
WHERE | ||
tt.tag IS NULL AND | ||
tt0.tag IS NOT NULL | ||
|
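The inserts above rely on the "LEFT JOIN ... IS NULL" anti-join idiom: `tt.tag IS NULL` selects keyboard tags with no exact match, while `tt0.tag IS NOT NULL` keeps only those whose base language is already known. The sketch below runs the `t_langtag_tag` insert in SQLite via Python's `sqlite3` as an approximation of the T-SQL. The sample tags, keyboard IDs, and the existing `tagtype` value of 1 are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_keyboard_language (keyboard_id TEXT, bcp47 TEXT, language_id TEXT);
CREATE TABLE t_langtag_tag (base_tag TEXT, tag TEXT, tagtype INTEGER);
INSERT INTO t_langtag_tag VALUES ('en', 'en', 1), ('km', 'km', 1);
INSERT INTO t_keyboard_language VALUES
  ('kb1', 'en',         'en'),   -- full tag already present
  ('kb2', 'km-fonipa',  'km'),   -- base language present, full tag missing
  ('kb3', 'qaa-x-demo', 'qaa');  -- neither present
""")

conn.execute("""
INSERT INTO t_langtag_tag (base_tag, tag, tagtype)
SELECT DISTINCT kl.bcp47, kl.bcp47, 5      -- 5: custom (keyboard) tag type
FROM t_keyboard_language kl
LEFT JOIN t_langtag_tag tt  ON kl.bcp47 = tt.tag
LEFT JOIN t_langtag_tag tt0 ON kl.language_id = tt0.tag
WHERE tt.tag IS NULL          -- no row matches the full tag
  AND tt0.tag IS NOT NULL     -- but the base language is known
""")

added = conn.execute(
    "SELECT base_tag, tag FROM t_langtag_tag WHERE tagtype = 5"
).fetchall()
print(added)
```

Only the `kb2` tag is inserted here. Tags like the `kb3` one, with no matching base tag at all, fall through to the separate fix-up step that follows.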
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
-- Fixup those where we cannot find any matching base tag at all (e.g. qa? tags will fit into this) | ||
|
||
INSERT | ||
t_langtag (tag, [full], iso639_3, region, regionname, name, sldr, script, windows) | ||
SELECT DISTINCT | ||
kl.bcp47, | ||
kl.bcp47, | ||
null, | ||
'001', --t.region, | ||
'World', --t.regionname, | ||
(select top 1 kl0.description from k0.t_keyboard_language kl0 where kl0.bcp47 = kl.bcp47), | ||
0, | ||
kl.script_id, | ||
kl.bcp47 | ||
FROM | ||
t_keyboard_language kl LEFT JOIN | ||
t_langtag_tag tt ON kl.bcp47 = tt.tag | ||
WHERE | ||
tt.tag IS NULL |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
-- Insert the tags above for searching against | ||
|
||
INSERT | ||
t_langtag_tag (base_tag, tag, tagtype) | ||
SELECT DISTINCT | ||
kl.bcp47, | ||
kl.bcp47, | ||
5 -- custom (keyboard) tag type | ||
FROM | ||
t_keyboard_language kl LEFT JOIN | ||
t_langtag_tag tt ON kl.bcp47 = tt.tag LEFT JOIN | ||
t_langtag t ON kl.bcp47 = t.tag | ||
WHERE | ||
tt.tag IS NULL AND | ||
t.tag IS NOT NULL |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
-- Add new names that have been defined by keyboard authors | ||
|
||
INSERT | ||
t_langtag_name (tag, name, name_kd, nametype) | ||
SELECT DISTINCT | ||
t.base_tag, | ||
kl.description, | ||
kl.description, -- TODO: we can't do full normalisation here, but we'll live with it for now | ||
4 -- custom | ||
FROM | ||
t_keyboard_language kl LEFT JOIN | ||
t_langtag_tag t ON kl.bcp47 = t.tag LEFT JOIN | ||
t_langtag_name n ON n.tag = t.base_tag AND n.name = kl.description | ||
WHERE | ||
n._id IS NULL and t.tag is not null |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
-- Finally, match up all the keyboards with langtags! | ||
|
||
INSERT | ||
t_keyboard_langtag | ||
SELECT | ||
kl.keyboard_id, tt.base_tag | ||
FROM | ||
t_keyboard_language kl INNER JOIN | ||
t_langtag_tag tt ON kl.bcp47 = tt.tag |
This file was deleted.
This is the only change in the search-prepare-data*.sql series; was: