Posted By

mandric on 04/26/08


Tagged

Bash script scraper


Scrape Gummy


 / Published in: Bash
 

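It all starts with gummy-stuff. This script fetches Yahoo-data.htm from gummy-stuff.org (only when the remote copy has changed) and writes the vars it finds, joined into one string, to ../data/yahoo_arg_string.txt.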

    #!/bin/sh

    URL='http://www.gummy-stuff.org/Yahoo-data.htm'

    # Alternative fetch:
    #lynx -dump -source "$URL" > ../files/Yahoo-data.htm

    # Get new gummy data if it changed (-N only downloads a newer copy).
    # The subshell keeps the cd from changing the script's working directory.
    (cd ../files/ && wget -N "$URL")

    # Alternative: print both captured fields, one row per line
    #perl -ne 'if (s/<B>(\S+)\s*<\/B>\s*<\/TD><TD><font\sface=times\s\S+>\s*(.*)<\/TD>.*/$1 $2/) { print }' ../files/Yahoo-data.htm

    # Just the vars: print the first captured field of every matching row,
    # with newlines removed so the result is a single string
    perl -ne 'if (s/<B>(\S+)\s*<\/B>\s*<\/TD><TD><font\sface=times\s\S+>\s*(.*)<\/TD>.*/$1/) { chomp; print }' ../files/Yahoo-data.htm > ../data/yahoo_arg_string.txt
