Posted By: eristoddle on 04/11/12


Tagged: soup python pinterest beautiful


Pinterest Scraping with Python and BeautifulSoup


Published in: Python

This requires:

* BeautifulSoup - http://www.crummy.com/software/BeautifulSoup/
* SoupSelect - http://code.google.com/p/soupselect/

    # Written for Python 2 with BeautifulSoup 3 and soupselect.
    import BeautifulSoup
    from soupselect import select
    # Assumption: the original snippet does not say where URL comes from.
    # pattern.web provides a URL class with a download() method that fits this usage.
    from pattern.web import URL

    BASE_URL = "https://pinterest.com"

    def fetch_soup(path):
        # path is a site-relative href such as "/pin/<id>/"; "" fetches the home page.
        # Stripping the leading slash avoids building "https://pinterest.com//...".
        html = URL(BASE_URL + "/" + path.lstrip("/")).download()
        return BeautifulSoup.BeautifulSoup(html)

    def pin_categories():
        # Collect the href of every category link in the home page submenu.
        return [c['href'] for c in select(fetch_soup(""), ".submenu a")]

    def crawl_pin_category(category):
        #TODO: find next pages
        return harvest_pins(fetch_soup(category))

    def harvest_pins(soup):
        # Each .pin element wraps a link to the pin's own page.
        return [p.find("a", {"class": "PinImage ImgLink"})['href'] for p in select(soup, ".pin")]

    def meta_content(soup, prop):
        # Content of the first <meta property="..."> tag; raises IndexError if the tag is absent.
        return select(soup, 'meta[property="%s"]' % prop)[0]['content']

    def grab_pin(pin_id):
        # pin_id is an href returned by harvest_pins.
        soup = fetch_soup(pin_id)
        og_keys = ["url", "title", "description", "image"]
        app_keys = ["pinboard", "pinner", "source", "likes", "repins", "comments", "actions"]
        pin = dict((k, meta_content(soup, "og:" + k)) for k in og_keys)
        pin.update((k, meta_content(soup, "pinterestapp:" + k)) for k in app_keys)
        return pin
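
The lookup that grab_pin performs can be tried offline. What follows is only a minimal sketch, assuming Python 3 and nothing beyond the standard library (html.parser stands in for BeautifulSoup and soupselect here). The sample markup, the MetaPropertyParser class and the meta_properties helper are all hypothetical names made up for illustration; they are not part of the original snippet, and real Pinterest pages may no longer expose these tags.

```python
from html.parser import HTMLParser

# Hypothetical sample markup, invented for this sketch.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Example pin">
<meta property="og:url" content="https://pinterest.com/pin/123/">
<meta property="pinterestapp:repins" content="7">
</head><body></body></html>
"""


class MetaPropertyParser(HTMLParser):
    """Collect <meta property="..." content="..."> pairs into a dict."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property")
        content = attr_map.get("content")
        if prop is not None and content is not None:
            # Keep the first occurrence, like select(...)[0] in the snippet.
            self.properties.setdefault(prop, content)


def meta_properties(html):
    """Return a dict mapping each meta property name to its content."""
    parser = MetaPropertyParser()
    parser.feed(html)
    parser.close()
    return parser.properties


props = meta_properties(SAMPLE_HTML)
print(props.get("og:title"))
# A missing tag yields None here instead of the IndexError that [0] would raise.
print(props.get("pinterestapp:likes"))
```

Reading every property into a dict once and using .get() means a page that lacks one tag (likes, for example) still returns the remaining fields, whereas indexing with [0] stops at the first missing tag.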


Comments

Posted By: cornmacabre on April 6, 2013

Great post - this could potentially provide a really valuable functionality for SEO & Content Marketing. However, the 'Import BeautifulSoup' stuff seems missing from this script, and I can't seem to get the script to output anything.

I know it's been over a year since you posted this, but do you have any documentation or full .py script for this? I'd really love to get this working and start tweaking it. It provides a great example of the functionality of BeautifulSoup. Thanks!

