GitHub - giantoak/image-lookup-two: A second version of the image lookup repository.

A Flask app for conducting reverse image search using Google's API.

Sample usage:

1. Run conda create --name image-lookup --file requirements.txt to create the program environment.
2. Run source activate image-lookup to enter the environment.
3. Run python run.py to start the server.
4. In a second terminal, start ngrok (ngrok http 8080) or another tool of your choice for exposing a local HTTP server. Alternatively, run the program as a dedicated web app on a web server.
5. Go to 0.0.0.0:8080, or your publicly exposed HTTP endpoint, in your browser.
6. Use the file browser to upload an image.
7. Browse the gallery of results.

The process for querying Google Reverse Image Search (as of December, 2015)

There are two ways to access image search:
Send a link to a hosted copy of the image as the image_url query parameter: http://images.google.com/searchbyimage?image_url=[image url]
Send a copy of the image to Google directly: http://www.google.com/searchbyimage/upload with an encoded_image field containing the raw image data and an image_content field containing an empty string.
Either way, before connecting to search you must set a User-Agent header that matches a modern browser.
Google will respond with a redirect to a results page (in a Location field).
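As a rough sketch of the link-based query and the User-Agent requirement, the request could be assembled with Python's standard library as shown here. The function name build_search_request, the example image URL, and the particular User-Agent string are assumptions for illustration; none of them come from this repository.

```python
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_BY_URL = "http://images.google.com/searchbyimage"

# Assumed example of a desktop-browser User-Agent; any modern one should do.
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"
)


def build_search_request(image_url):
    """Return a Request for the link-based query (hypothetical helper)."""
    # urlencode percent-escapes the image URL so it survives as one parameter.
    query = urlencode({"image_url": image_url})
    return Request(SEARCH_BY_URL + "?" + query,
                   headers={"User-Agent": USER_AGENT})


request = build_search_request("http://example.com/cat.jpg")
print(request.full_url)
```

Nothing is sent over the network here. Passing the request to urllib.request.urlopen would follow the redirect automatically, and the response's geturl() method would then report the address of the results page.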
This page will include results, plus a tag linking to a much larger, "visually similar images" results page: <a href="[url]">Visually similar images</a>. Extract this tag, and request the linked page.
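One possible way to pull out that link, sketched with the standard library's html.parser, is to remember the href of each <a> tag and keep the one whose text is "Visually similar images". The class and function names and the sample HTML are made up for illustration, and the real page markup may differ.

```python
from html.parser import HTMLParser


class SimilarLinkParser(HTMLParser):
    """Record the href of the <a> whose text is 'Visually similar images'."""

    def __init__(self):
        super().__init__()
        self._current_href = None  # href of the <a> currently open, if any
        self._text = []            # text fragments seen inside that <a>
        self.similar_href = None   # first matching href found

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._current_href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            label = "".join(self._text).strip()
            if label == "Visually similar images" and self.similar_href is None:
                self.similar_href = self._current_href
            self._current_href = None


def find_similar_link(html):
    """Return the link's href, or None when the page has no such link."""
    parser = SimilarLinkParser()
    parser.feed(html)
    return parser.similar_href


# Made-up sample markup; the href is a placeholder, not a real Google URL.
SAMPLE_HTML = (
    '<div><a href="/other">Other</a>'
    '<a href="/search?q=placeholder&amp;page=similar">'
    'Visually similar images</a></div>'
)
print(find_similar_link(SAMPLE_HTML))
```

If the href turns out to be relative, urllib.parse.urljoin can combine it with the address of the results page before the follow-up request is made.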
This final page will contain a number of results for us to extract, generally conforming to schema.org's Search Results Page standard:
<a class="qb-b">[query text]</a>, if Google has a search query associated with this particular image. (This isn't likely, unless we're searching on something particularly popular.)
All results are nested within tags of the form <div class="rg_di rg_el ivg-i">. Of particular note within are these three tags:
1. <a class="rg_l" href="/imgres?imgurl=[url of image]&imgrefurl=[website of image]&h=[height]&w=[width]&tbnid=[google internal id]&…&…" …>
2. <span class="rg_ilmn">[resolution] - [image domain]</span>
3. <div class="rg_meta">{JSON dict of various features}</div>
Keys of interest in the rg_meta JSON dict:
1. pt: Some type of heading or subject string. I'm uncertain if this key is mandatory.
2. s: An additional subject string. It may or may not be the same as pt. I'm uncertain if this key is mandatory.
3. id: A repeat of the tbnid value mentioned above.
This is very brittle, but within the eighth <script> tag Google stores thumbnails of the first ten images, keyed by the tbnid value. If you can get to the right portion, you can extract the raw image.
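Once a result's tags have been located, the rg_l link parameters and the rg_meta keys described above can be read with urllib.parse and json. This is a minimal sketch under those assumptions: the helper names and sample values are hypothetical, and missing keys come back as None because it is unclear which ones are mandatory.

```python
import json
from urllib.parse import parse_qs, urlparse


def parse_imgres_href(href):
    """Extract the documented query parameters from an rg_l link's href."""
    params = parse_qs(urlparse(href).query)

    def first(key):
        # parse_qs maps each key to a list; take the first value or None.
        return params.get(key, [None])[0]

    # All values are returned as strings, exactly as they appear in the URL.
    return {
        "image_url": first("imgurl"),
        "page_url": first("imgrefurl"),
        "height": first("h"),
        "width": first("w"),
        "tbnid": first("tbnid"),
    }


def parse_rg_meta(text):
    """Read the pt, s and id keys from the JSON inside an rg_meta div."""
    meta = json.loads(text)
    return {key: meta.get(key) for key in ("pt", "s", "id")}


# Made-up sample values for illustration only.
sample_href = ("/imgres?imgurl=http%3A%2F%2Fexample.com%2Fcat.jpg"
               "&imgrefurl=http%3A%2F%2Fexample.com%2F&h=480&w=640&tbnid=ABC123")
sample_meta = '{"pt": "A cat", "id": "ABC123"}'

link = parse_imgres_href(sample_href)
meta = parse_rg_meta(sample_meta)
print(link["image_url"], link["tbnid"] == meta["id"])
```

Comparing the link's tbnid with the id key in rg_meta is a cheap consistency check that both pieces belong to the same result.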
