
Location data #64

Closed

gilesknap opened this issue Mar 6, 2019 · 12 comments

@gilesknap (Owner) commented Mar 6, 2019

Google has seen fit to not provide location data with the images downloaded by the Google Photos API.
I guess I could use Beautiful Soup and the productUrl to scrape this info and insert it into our index, plus into the JPG EXIF as required.
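
In requests/Beautiful Soup terms the idea would look roughly like this. A purely hypothetical sketch: the selector is a placeholder, and (as the next comment explains) the page never actually renders without a JavaScript-enabled login.

```python
# Hypothetical sketch only: fetch a media item's productUrl and hunt for a
# GPS/maps link in the markup. In practice this dead-ends at a "please
# enable JavaScript" page, as noted in the next comment.
import requests
from bs4 import BeautifulSoup

def scrape_gps(product_url: str, session: requests.Session):
    resp = session.get(product_url)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    # Placeholder selector: look for a maps link in the photo's info panel.
    node = soup.find("a", href=lambda h: h and "maps" in h)
    return node["href"] if node else None
```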

@gilesknap (Owner Author) commented Mar 9, 2019

Nice idea, but ...
Accessing the Google Photos page requires an interactive login token, and to get that you need to interact with the login pages using a JavaScript-enabled browser. I had a try at using requests and bs4 but only got as far as a page telling me to enable JavaScript. Now I guess we could run it through a rendering engine somehow, but this all seems too messy.

@gilesknap (Owner Author) commented:

Yeah, what I said. But it looks like there may be some solutions out there, such as Selenium or PhantomJS; see the discussion at https://stackoverflow.com/questions/5793414/mechanize-and-javascript
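
For anyone wanting to experiment, a minimal Selenium sketch would be something like the following. It assumes geckodriver is installed and that the login can somehow be completed in the driven browser, which is exactly the open question here.

```python
# Hypothetical sketch: drive a real browser so the page's JavaScript runs,
# then hand the rendered HTML to a parser (unlike plain requests).
from selenium import webdriver

product_url = "https://photos.google.com/..."  # placeholder productUrl

driver = webdriver.Firefox()
try:
    driver.get(product_url)
    html = driver.page_source  # the fully rendered DOM
    # ... parse `html` with Beautiful Soup as in the earlier sketch ...
finally:
    driver.quit()
```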

gilesknap removed the wontfix label Mar 13, 2019
@gilesknap (Owner Author) commented Mar 13, 2019

Made good progress on this. I can now scrape GPS data into the DB index.
I have yet to add this to the JPG files. That is pending the EXIF library gaining write capability (see this issue), which I might look at myself. But I'm pausing this project for a little while to work on a new project, bookbot.
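
In the meantime, for anyone wanting to write scraped GPS data into JPGs themselves, a library such as piexif can already do EXIF writes. A minimal sketch (the conversion helper is illustrative, but the tags are the standard GPS IFD fields):

```python
# Minimal sketch using piexif, which supports writing EXIF today.
import piexif

def to_dms_rational(deg: float):
    """Convert decimal degrees to EXIF (deg, min, sec) rational triples."""
    d = int(deg)
    m = int((deg - d) * 60)
    s = round(((deg - d) * 60 - m) * 60 * 100)
    return ((d, 1), (m, 1), (s, 100))

def write_gps(path: str, lat: float, lon: float):
    exif_dict = piexif.load(path)
    exif_dict["GPS"] = {
        piexif.GPSIFD.GPSLatitudeRef: b"N" if lat >= 0 else b"S",
        piexif.GPSIFD.GPSLatitude: to_dms_rational(abs(lat)),
        piexif.GPSIFD.GPSLongitudeRef: b"E" if lon >= 0 else b"W",
        piexif.GPSIFD.GPSLongitude: to_dms_rational(abs(lon)),
    }
    piexif.insert(piexif.dump(exif_dict), path)
```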

@gilesknap (Owner Author) commented:

I've read that Google will be disabling logins via embedded browser frameworks. I'm pretty sure that this will totally close off the approach I'm using here. See https://www.zdnet.com/article/google-bans-logins-from-embedded-browser-frameworks-to-prevent-mitm-phishing/
So it looks like Google will be keeping our GPS data. I have a sneaking feeling that they don't want to release ALL your data via the Google Photos API, in case people end up using their backend without them getting to have any influence on what the user sees.

My web-scraping approach would still work if I could hand a normal interactive login token to Selenium. But that sounds like a security hole, so it is probably not possible.

@gilesknap (Owner Author) commented:

My location-scraping code is no longer working. Either Google has changed the login screen or it has already shut down login via embedded frameworks.

@gilesknap (Owner Author) commented:

I'm adding a wontfix label to this because Google has shut down all avenues.

I will keep this open, hoping that things might change.

@gilesknap (Owner Author) commented Jul 14, 2019

I have an idea.
Instead of web scraping, I should get gphotos-sync to scan a Google Takeout download.

I believe Google is obliged to provide a service like Takeout due to data protection rules, so this should be more sustainable.

I have confirmed that Takeout includes GPS data.

This would be quite clunky, particularly since it requires a full download of your library.

Note that we could also use this to check the modified date and update photos that were edited online.
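
The scan itself would be straightforward: each media file in a Takeout archive comes with a JSON sidecar carrying a geoData block. A rough sketch (field names are as seen in current Takeout exports and could change):

```python
# Sketch: walk an unzipped Takeout tree and yield (title, lat, lon) for
# every item whose JSON sidecar carries non-zero GPS coordinates.
import json
from pathlib import Path

def takeout_gps(takeout_root: str):
    for meta in Path(takeout_root).rglob("*.json"):
        try:
            data = json.loads(meta.read_text(encoding="utf-8"))
        except (ValueError, OSError):
            continue  # skip album metadata or unreadable files
        geo = data.get("geoData", {})
        if geo.get("latitude") or geo.get("longitude"):
            yield data.get("title"), geo["latitude"], geo["longitude"]
```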

@mholt commented Jul 15, 2019

@gilesknap Hey, I dunno if this would be helpful, but we've been trying to solve the same problem over here: mholt/timeliner#38

And I think the (unfortunate) conclusion we are also arriving at is that integrating with Takeout archives may be the best way to get the location data.

Do you think there is a way to correlate media items from the API with a Takeout archive? Some sort of ID that is consistent between the two sources?

@gilesknap (Owner Author) commented:

Hi Matt. I do have a plan for a way of doing this.

gphotos-sync has a comparison option used to check the downloaded photo library against a previous backup. It works regardless of where the previous backup comes from, so it needs to match photos and videos up even if their filenames and folders are different.

I will use this same approach to match my synchronized library against Google Takeout files.

The matching primarily uses the EXIF UID to ensure that the files it is comparing match. When this fails it falls back to using dates, file sizes and filenames. Note that the create date is extracted from the file's metadata for both videos and images and is reasonably unique on its own (videos don't have a UID, unfortunately).

The current file comparison scans over both sets of files and builds a DB table for each. It then runs the sequence of queries in the list 'match' in https://github.com/gilesknap/gphotos-sync/blob/master/gphotos/Queries.py.

I find that this algorithm uniquely matches everything in my 110,000-item library, despite the fact that it goes back to 1997 and has many duplicate filenames and images that predate EXIF, etc.
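
Reduced to pseudocode, the cascade looks something like this. A simplified illustration only: the real implementation is the sequence of SQL queries in Queries.py, and the key names here are made up for the sketch.

```python
# Simplified illustration of the matching cascade: try the strongest key
# first, then fall back to progressively weaker composite keys.
MATCH_KEYS = [
    ("uid",),                         # EXIF unique ID, when present
    ("create_date", "size", "name"),  # fallback (videos have no UID)
    ("create_date", "size"),
    ("create_date", "name"),
]

def match(local_rows, takeout_rows):
    matched = {}
    for keys in MATCH_KEYS:
        index = {}
        for r in takeout_rows:
            k = tuple(r.get(f) for f in keys)
            if None not in k:
                index[k] = r
        for row in local_rows:
            if row["id"] in matched:
                continue  # already matched by a stronger key
            k = tuple(row.get(f) for f in keys)
            if None not in k and k in index:
                matched[row["id"]] = index[k]
    return matched
```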

HTH, giles.

@gilesknap (Owner Author) commented:

UPDATE: I was hoping it would be possible to download only the 'last' zip file from Takeout for an incremental update. But no: the library contents are randomly scattered across the zip files.

It makes a bit of a mockery of a nice incremental backup system to have to download the entire library again each time you want to update your GPS info.

@gilesknap (Owner Author) commented:

Another possible approach: I had not realized you can paste JavaScript into an active page. This project did so to automate deletion of all photos: https://github.com/mrishab/google-photos-delete-tool. Perhaps that approach could be used to scrape GPS info.
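
Driven from Python, the same console-script idea might look like this. Entirely hypothetical: the injected query is a placeholder, since the real page structure is obfuscated, and it assumes you log in manually in the driven browser first.

```python
# Hypothetical sketch: inject JavaScript into an already-authenticated
# Google Photos tab via Selenium and read back whatever it finds.
from selenium import webdriver

driver = webdriver.Firefox()
driver.get("https://photos.google.com/")
input("Press Enter once you are logged in and viewing a photo...")
# Placeholder script: look for a maps link in the rendered page.
gps_href = driver.execute_script(
    "const el = document.querySelector('a[href*=\"maps\"]');"
    "return el ? el.href : null;"
)
print(gps_href)
driver.quit()
```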

@gilesknap (Owner Author) commented:

Wrapping these up into #119.
