Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Translate name to request language when available. #1301

Merged
merged 1 commit into from
May 28, 2019

Conversation

mihneadb
Copy link
Contributor

Context: #1296

This makes api choose name.LANG when available. If it isn't possible, it defaults to name.default as it used to.

@mihneadb mihneadb requested review from missinglink and Joxit May 22, 2019 10:42
@mihneadb mihneadb self-assigned this May 22, 2019
Copy link
Member

@Joxit Joxit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this PR, only the property name in the GeoJSON is internationalized.
Most of the time, we are using label property instead of name.
Maybe, this work should be done in changeLanguage midleware and not in geojsonify.

helper/geojsonify.js Outdated Show resolved Hide resolved
@mihneadb
Copy link
Contributor Author

With this PR, only the property name in the GeoJSON is internationalized.
Most of the time, we are using label property instead of name.
Maybe, this work should be done in changeLanguage midleware and not in geojsonify.

Looking at the code, I see both assignLabels and changeLanguage rely on / use name.default. Should I keep this behavior and substitute name.default with name.LANG's value if it makes sense?
Also - assignLabels happens before changeLanguage - when we construct a label like "name + rest_of_address", name should be translated already otherwise we'd get a different language in label than in name.

Thoughts?

@Joxit
Copy link
Member

Joxit commented May 23, 2019

I think you should update the name.default here

res.data.forEach( function( doc, p ){
// skip invalid records

Because changeLanguage should do what it says, change the language.
I think it will be weird to do the job in assignLabels + geojsonify.

@mihneadb
Copy link
Contributor Author

I think you should update the name.default here

res.data.forEach( function( doc, p ){
// skip invalid records

Because changeLanguage should do what it says, change the language.
I think it will be weird to do the job in assignLabels + geojsonify.

Indeed, no question there.
What I'm not sure of - it seems that by the tame we get to changeLanguage, assignLabels has already happened. And it has constructed the label field by including name.default (the non translated one). That would lead to some inconsistencies.

Is my observation correct? And if so, would it maybe make sense to do assignLabels after changeLanguage?

@Joxit
Copy link
Member

Joxit commented May 23, 2019

Nope, changeLanguage is executed before assignLabels 😄

api/routes/v1.js

Lines 306 to 307 in ceaf1f0

postProc.changeLanguage(changeLanguageService, changeLanguageShouldExecute),
postProc.assignLabels(),

@mihneadb
Copy link
Contributor Author

My bad, I was scrolled into the postProc object 🤦‍♂ . Thanks! Addressing.

@mihneadb
Copy link
Contributor Author

@Joxit I updated the code. I don't understand the Travis failure, could you please shed some light? npm test runs successfully on my machine.

@Joxit
Copy link
Member

Joxit commented May 24, 2019

Tests will fail if you do npm update
I think Geolib failed their 3.0.0 release, I opened #1303 to fix this 😄
Don't worry :) you will need to rebase to master when #1303 will be merged

middleware/changeLanguage.js Outdated Show resolved Hide resolved
Copy link
Member

@Joxit Joxit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works fine 😄

@mihneadb
Copy link
Contributor Author

@Joxit I rebased on top of the new master.

@mihneadb
Copy link
Contributor Author

@Joxit testing this I realized that placeholder is required to be up for the translation to happen, even though the [services] page claims that for autocomplete you don't need placeholder. Do you see any quick win / way around this?
If not, maybe I could update that doc to say "optional - needed for translations".
Wdyt?

@orangejulius
Copy link
Member

Hey @mihneadb,
This PR looks good in my testing so far. Note that we already use Placeholder for all admin translations, but this PR will help with venues, especially from OSM where they often have translations in many languages. I will give it a bit more testing just to make sure but I think it's good to merge.

The services page does already have a column at the right side noting that Placeholder is required for all admin translations. Maybe we can make it more clear somehow?

@mihneadb
Copy link
Contributor Author

Hey @mihneadb,
This PR looks good in my testing so far. Note that we already use Placeholder for all admin translations, but this PR will help with venues, especially from OSM where they often have translations in many languages. I will give it a bit more testing just to make sure but I think it's good to merge.

The services page does already have a column at the right side noting that Placeholder is required for all admin translations. Maybe we can make it more clear somehow?

You're right, I see it now. Fail on my part. Nvm then :). I didn't read all of the Pelias code unfortunately so maybe this is a stupid question - but looking at changeLanguage.js, it would seem that all the necessary info to translate admin and venue names in in the ES documents themselves, which were imported with name.*. What's the part that I'm missing that's making placeholder required at runtime for translations?

Thanks for the help on this! Glad it turned out to work :).

@orangejulius
Copy link
Member

You're right that all the name translations are present in the Elasticsearch document, but not all of the parent fields.

For example, we have 8 million addresses here in NYC. There aren't really translations for most address records, but if there were, we would store them in each record. However, there are about 100 translations for New York City and those are not stored with each record, or we would have 8 million * 100 = 8 billion copies of the same 100 translations in Elasticsearch. A relational database is much better for that, and that's why Placeholder uses SQLite

@mihneadb
Copy link
Contributor Author

mihneadb commented May 24, 2019 via email

@mihneadb
Copy link
Contributor Author

Is there anything else I need to help with on this or is it just about testing and playing with it some more?

Thanks!

orangejulius added a commit to pelias/acceptance-tests that referenced this pull request May 28, 2019
As of pelias/api#1301, it is now possible to _return translated results_ when searching for venues.

This adds a new test suite of venue translation test cases, and renames
the `placeholder altnames` test case to `admin translations` to match
it.
@orangejulius
Copy link
Member

orangejulius commented May 28, 2019

Hi @mihneadb,
Nothing was needed on your end, just had to find a bit more time to finish testing this.

I've now done so, and am super excited, this is an important bit of functionality and I think it's good to go.

I found one thing that needs to be changed: it's important to only change the name.default value if our language middleware detected a language value either through headers or the lang param, rather than defaulting to English. Without this, any such request will translate all venue names to english, rather than the local language, which was a significant change in behavior.

I've made that change, squashed your commits, updated the commit message to our format that will automatically add this feature to the package changelog, and pushed all that back to your branch so it can be merged easily. I also made some acceptance tests to verify all this behavior over in pelias/acceptance-tests#500.

Thanks for taking the time to look at this!

@orangejulius orangejulius merged commit bb264f7 into pelias:master May 28, 2019
@mihneadb
Copy link
Contributor Author

Right, that makes sense. Thanks a lot for taking care of this!

@mihneadb mihneadb deleted the translate-name branch May 29, 2019 06:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants