Dec 07 2012
 

I guess if you import enough libraries just about anything can be made into a one-liner… if you have imported BeautifulSoup, re, requests, and sys, in Python 3 you can simply do:

print(re.sub(r'^.*imgurl=([^&]+)&.*$', r'\1', str(BeautifulSoup(requests.get("http://images.google.com/search?num=50&hl=en&safe=off&site=&tbm=isch&source=hp&biw=1744&bih=1279&q=%s&oq=" % sys.argv[1]).text, "html.parser").find(href=re.compile("imgurl")))))

That finds the first hit on a Google image search for argv[1]. Google will probably change their image URLs later today and it’ll stop working, but I wanted this for a random task….
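If you want to poke at the substitution without hitting Google, here is a minimal offline sketch. The anchor string is a made-up, shortened stand-in for what the search page returned at the time (the example.com URLs and the name sample_anchor are assumptions, not real results); the pattern is the same one used above.

```python
import re

# Same pattern as the one-liner: capture everything after "imgurl="
# up to the next "&" and throw the rest of the string away.
PATTERN = r'^.*imgurl=([^&]+)&.*$'

# Hypothetical anchor tag, shaped like str(tag) output from BeautifulSoup
# (note that "&" comes back escaped as "&amp;").
sample_anchor = (
    '<a href="/imgres?imgurl=http://example.com/pics/ring.jpg'
    '&amp;imgrefurl=http://example.com/page&amp;h=318&amp;w=620">'
    '<img src="http://example.com/thumb.jpg"/></a>'
)

real_link = re.sub(PATTERN, r'\1', sample_anchor)
print(real_link)

# When nothing matches, re.sub hands the input back untouched. So if
# find() returned None, str(None) goes in and "None" comes out.
no_match = re.sub(PATTERN, r'\1', str(None))
print(no_match)
```

The second call is why the one-liner prints None rather than blowing up when the page has no imgurl link.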

E.g. “foo.py one+ring+to+bind+them” currently yields the URL for this beauty:

Or you can wimp out and do it the ez way.

#!/usr/local/bin/python3

#
# search google images for the first match to a word (optionally more than
# one, put together by quotes); returns the URL of the first match.
#
# Usage: $0 name-to-search-for
#
#
# Google image URLs currently look like (for a search for "monkey+breath"):
#
# <a href="/imgres?imgurl=http://amirobyn.com/blog/wp-content/uploads/2009/07/monkeybreath03.jpg&imgrefurl=http://amirobyn.com/blog/%3Fp%3D16&usg=__y39gYotHzJkeYQ2RhxJAkQIbLf4=&h=318&w=620&sz=59&hl=en&start=1&zoom=1&tbnid=bUjKYBdrdgvHSM:&tbnh=70&tbnw=136&ei=kTrCUMzzCs_siQLd0oDwBQ&prev=/search%3Fq%3Dmonkey%2Bbreath%26hl%3Den%26safe%3Doff%26biw%3D1744%26bih%3D1279%26ie%3DUTF-8%26tbm%3Disch&itbs=1"><img src="http://t2.gstatic.com/images?q=tbn:ANd9GcS19-iKGCVUWOdUwzdighrxyDpU3HWLpDPiAcmdPHVDIgDG7U2Y5GAVX70L" alt="" width="136" height="70" /></a>
#
# a quick dispatch to beautiful soup and a substitution and the real URL is yours -
# in this case, at this time:
#
# http://amirobyn.com/blog/wp-content/uploads/2009/07/monkeybreath03.jpg
#

from bs4 import BeautifulSoup

import re
import requests
import sys

# exactly one argument (the search term) is expected
if len(sys.argv) != 2:
    print("Usage: %s image-name-to-search-(can-use-pluses-between-multi-words)" % sys.argv[0])
    sys.exit(1)

# You can do it in one monster line... monster line... monster line....
# print(re.sub(r'^.*imgurl=([^&]+)&.*$', r'\1', str(BeautifulSoup(requests.get("http://images.google.com/search?num=50&hl=en&safe=off&site=&tbm=isch&source=hp&biw=1744&bih=1279&q=%s&oq=" % sys.argv[1]).text, "html.parser").find(href=re.compile("imgurl")))))

#
# or have a prayer of understanding it the usual way
#
url = "http://images.google.com/search?num=50&hl=en&safe=off&site=&tbm=isch&source=hp&biw=1744&bih=1279&q=%s&oq=" % sys.argv[1]

print("Searching for %s" % sys.argv[1])

r = requests.get(url)
# name the parser explicitly so bs4 does not have to guess (and warn)
soup = BeautifulSoup(r.text, "html.parser")

# find urls that have the imgurl

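big_link = soup.find(href=re.compile("imgurl"))
if big_link is None:
    # nothing matched: bail out instead of printing the string "None"
    sys.exit("No link containing imgurl found; the page layout may have changed.")
real_link = re.sub(r'^.*imgurl=([^&]+)&.*$', r'\1', str(big_link))
print(real_link)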
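If the regex feels too fragile, one alternative is to let the standard library do the URL work. This is only a sketch under assumptions, not what the script above does: the helper names build_search_url and imgurl_from_href are made up for illustration, the query string is trimmed to just tbm and q, and the sample href uses example.com placeholders. It expects the raw href value (e.g. big_link["href"]) rather than the stringified tag.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def build_search_url(term):
    """Build an image-search URL; urlencode turns spaces into pluses."""
    # Trimmed-down parameter set (an assumption); the script above sends more.
    return "http://images.google.com/search?" + urlencode({"tbm": "isch", "q": term})

def imgurl_from_href(href):
    """Return the imgurl query parameter of an href, or None if absent."""
    values = parse_qs(urlsplit(href).query).get("imgurl")
    return values[0] if values else None

# Hypothetical href in the same shape as the monkey+breath example.
sample_href = ("/imgres?imgurl=http://example.com/pics/ring.jpg"
               "&imgrefurl=http://example.com/page&h=318&w=620")

print(build_search_url("one ring to bind them"))
print(imgurl_from_href(sample_href))
print(imgurl_from_href("/search?q=nothing+here"))
```

With this you can pass the search term with plain spaces and skip typing the pluses yourself.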
