Posted By

rowntreerob on 11/03/09


Tagged

perl bookmarks linkcheck



run 'linkcheck' on bookmarks 1 of 2


Published in: Bash
 

URL: http://search.cpan.org/~marclang/ParallelUserAgent-2.57/lib/LWP/Parallel.pm

Export your browser bookmarks to a bookmarks.html file, extract the links from that file into a list, then use the list as INPUT to a link checker that reports dead links.

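  # STEP 1: build the list of links. A trailing | continues the pipeline on
  # the next line, so no backslash is needed before the comments.
  grep 'HREF=.http' bookmarks.html |  # get the lines holding http links (HREF attributes)
  awk '{print $2}' |                  # keep just the word with the link
  sed 's/......//' |                  # remove the 6-character prefix HREF="
  sed 's/"$//' |                      # remove the trailing quote
  grep -v https |                     # skip https links
  grep -v mozilla > linkcheck_in      # skip mozilla links; save the file as INPUT for step 2

  perl check_links_1.pl < linkcheck_in  # STEP 2: run the Perl link checker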

  <<STEP 2 STDOUT>>

  Answer for 'http://www.warnerbros.com/hipclips/' was 404: Not Found
  Answer for 'http://www.informit.com/articles/article.aspx?p=353736&seqNum=4&rll=1' was 200: OK
  Answer for 'http://www.cs.washington.edu/homes/amp/opine/emnlp05_opine.pdf' was 403: Forbidden
  Answer for 'http://java.sun.com/javase/technologies/desktop/javawebstart/index.jsp' was 200: OK
  Answer for 'http://www.sfgate.com/eguide/' was 200: OK
  Answer for 'http://oedb.org/library/college-basics/invisible-web' was 200: OK
  Answer for 'http://linuxmafia.com/bale/' was 200: OK
  Answer for 'http://www.oracle.com/index.html' was 200: OK
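
Since the goal is to find dead links, the report above can be narrowed to the non-200 answers. The following is a minimal sketch, not part of the original snippet: `dead_links` is a hypothetical helper name, and it assumes every report line has the exact form `Answer for '<url>' was <status>` shown in the sample output.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not from the original snippet):
# read the link checker's report on stdin and print only the entries
# whose status is not "200: OK", as "<url> (<status>)".
dead_links() {
  grep -v "was 200: OK" |
  sed -n "s/^[[:space:]]*Answer for '\(.*\)' was \(.*\)$/\1 (\2)/p"
}
```

It could be used as a third stage, for example `perl check_links_1.pl < linkcheck_in | dead_links`. Lines that do not match the assumed format are dropped silently, so the pattern would need adjusting if the checker's output differs.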
