use "name:en" as a fallback for when no default name was found #498
@missinglink great stuff for adding this! I was wondering, in the context of pelias/api#1301, wouldn't it make sense to import all available names in the docs?
@mihneadb we're already doing that in https://github.com/pelias/openstreetmap/blob/master/stream/tag_mapper.js#L37-L45, so it's already set up for allowing us to query for various languages at runtime :) The issue highlighted by #497 is that the address is available as GID
Our code requires a Having said that, it might be something we need to change in the future to improve i18n.
Thanks for the insight. That still means that if a place has only a
Unfortunately yes, the order of priority for
Do you think it would be best to add another level of fallback which accepts any other set language in the case all of these fail?
We could also check if the
This PR looks good as is. Another thing we might consider is this, for essentially item 6 on the order of priority above: If there is any single So for example if there is only
What about if I change the PR to this:
// Handle the case where no default name was set but there were
// other names which could be used as the default.
if( !doc.getName('default') ){
  var defaultName =
    doc.getName('official') ||
    doc.getName('international') ||
    doc.getName('national') ||
    doc.getName('regional') ||
    doc.getName('en');
  // use one of the preferred name tags
  if ( defaultName ){
    doc.setName('default', defaultName);
  }
  // else just select the first available name tag
  else {
    var tags = Object.keys(doc.name);
    if ( tags.length > 0 ){
      doc.setName('default', doc.getName(tags[0]));
    }
  }
}
I ran the end-to-end tests and they didn't catch any other cases, but they only cover Vancouver (because it's small and dual-language), and this issue might be more pronounced in other parts of the world.
I don't really like the idea of using the first name tag. That would effectively prioritize languages in alphabetical order, which doesn't really make sense. It's hard to evaluate the code snippet above, since it's not a diff. How about we merge this PR and then discuss refactoring/other improvements in their own PR?
Yea, picking the first name will either favour tags by lexical sorting or insertion order, which isn't very elegant 😟 In a rare case where none of the preferred tags are available and Since it's fairly rare (I'm assuming), we might be better off selecting any name rather than no name, just so the record isn't discarded during import. Is there possibly another algorithm we could use to prioritize languages?
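To illustrate the ordering concern, here is a minimal sketch showing that "first key" depends on how the object was built. The plain objects and the helper names `firstKey` and `firstKeySorted` are hypothetical and only stand in for `doc.name`; they are not part of the importer.

```javascript
// Hypothetical name maps with the same tags inserted in a different order.
const namesA = { fr: 'Vienne', de: 'Wien' };
const namesB = { de: 'Wien', fr: 'Vienne' };

// Object.keys returns non-integer string keys in insertion order,
// so the "first" tag depends on the order the tags were added.
function firstKey(names) {
  const keys = Object.keys(names);
  return keys.length > 0 ? keys[0] : undefined;
}

// Sorting makes the choice deterministic, but it then favours
// whichever language code comes first alphabetically.
function firstKeySorted(names) {
  const keys = Object.keys(names).sort();
  return keys.length > 0 ? keys[0] : undefined;
}

console.log(firstKey(namesA), firstKey(namesB));             // fr de
console.log(firstKeySorted(namesA), firstKeySorted(namesB)); // de de
```

Neither behaviour reflects which language is actually most relevant for the place, which is the objection raised in this thread.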
+1
How about trying the language code which matches the country of the place? This needs a map of official languages for each country code though. It would have to be sorted by language popularity too (e.g. Switzerland has 4 languages). I guess this could also be filed as a separate enhancement request. It's lower priority and shouldn't block this PR.
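A rough sketch of that idea, under stated assumptions: the `COUNTRY_LANGUAGES` table and the `pickNameByCountry` helper are hypothetical, the table is deliberately tiny, and the ordering within each entry is only an assumed popularity ranking. A real version would need a complete, maintained dataset.

```javascript
// Hypothetical lookup: ISO 3166-1 alpha-2 country code -> ISO 639-1
// language codes, ordered by assumed popularity (illustrative only).
const COUNTRY_LANGUAGES = new Map([
  ['CH', ['de', 'fr', 'it', 'rm']],
  ['CA', ['en', 'fr']],
  ['DE', ['de']]
]);

// Return the first available name whose language tag matches one of the
// country's languages, or undefined if nothing matches.
function pickNameByCountry(names, countryCode) {
  const languages = COUNTRY_LANGUAGES.get(countryCode) || [];
  for (const lang of languages) {
    if (names[lang]) {
      return names[lang];
    }
  }
  return undefined;
}

// A Swiss place tagged only in French and Italian resolves to the French name,
// because 'fr' comes before 'it' in the assumed ranking.
console.log(pickNameByCountry({ it: 'Ginevra', fr: 'Genève' }, 'CH'));
```

This only helps when the country of the place is known at import time, which this thread does not confirm.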
okay, here are some stats I generated today from a planet build.
Adding the first-level fallback (the one which considers For the second-level fallback (the contentious bit where we select one of the names prefixed with a language code), we could import an additional 2941 records, of which the split is:
some examples: 3x name:** fields
2x name:** fields
So I'm going to rebase this PR to solve this issue for the 11587 'simple cases' and also the 2744 'unambiguous' cases. I'm not going to try and tackle the 197 'ambiguous' cases because I don't have time for it, although I'd be happy if someone else would like to address them; I've added examples above to aid with this task.
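For readers following along, the two-level approach can be sketched roughly as follows. This is an assumption-laden illustration, not the merged code: it works on a plain object rather than the importer's document model, the `chooseDefaultName` helper is hypothetical, and the preferred-tag list is copied from the snippet proposed earlier in this thread, which may differ from what the PR finally uses.

```javascript
// Preferred tags, in priority order, taken from the earlier proposal
// in this thread (assumed, may not match the merged PR).
const PREFERRED_TAGS = ['official', 'international', 'national', 'regional', 'en'];

// Hypothetical helper: returns the name to use as the default, or
// undefined when no unambiguous choice exists.
function chooseDefaultName(names) {
  // An existing default always wins.
  if (names.default) {
    return names.default;
  }

  // First-level fallback: the first preferred tag that is set.
  for (const tag of PREFERRED_TAGS) {
    if (names[tag]) {
      return names[tag];
    }
  }

  // Second-level fallback: accept a two-letter language tag only when
  // it is the single one present (the 'unambiguous' case).
  const languageKeys = Object.keys(names).filter((key) => key.length === 2);
  if (languageKeys.length === 1) {
    return names[languageKeys[0]];
  }

  // Ambiguous (several language tags) or empty: leave unresolved.
  return undefined;
}

console.log(chooseDefaultName({ de: 'Wien' }));               // Wien
console.log(chooseDefaultName({ de: 'Wien', fr: 'Vienne' })); // undefined
```

Returning `undefined` for the ambiguous case mirrors the decision to leave those records for a later change rather than guess a language.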
Nice analysis. Looks like a really clean and good solution and thankfully the data shows we don't have to make any difficult choices this time :)
var keys = Object.keys(doc.name).filter(n => n.length === 2);

// unambiguous (there is only a single two-letter name tag)
if ( keys.length === 1 ){
Thanks for pursuing this! Looks like the best solution.
resolves #497