Posted By

whitetiger on 11/09/06


Tagged

regex default python count cli world delicious hello dictionary documentation dangerous


Python - Capture all links


 / Published in: Python
 

    import re
    import sys

    # Usage: python script.py file.html

    # Matches a line containing an <a ... href=...> tag, in any letter case.
    links = re.compile('[<].?[Aa].*[Hh][Rr][Ee][Ff].*=.*[\"\']?.*[\"\']?.?[>]')
    # Matches an http URL ending in "zip". The posted pattern read 'http:-*',
    # which cannot match a real URL; 'http:.*' is assumed to be the intent.
    links2 = re.compile('http:.*[Zz][Ii][Pp]')

    with open(sys.argv[1], 'r') as f:
        # Iterate line by line. This replaces the manual countdown of the
        # file size, which never reaches zero if len(riga) and the size
        # on disk disagree (e.g. after newline translation).
        for riga in f:
            comparazione = links.search(riga)
            if comparazione:
                output = comparazione.group(0)
                output2 = links2.search(output)
                if output2:
                    print(output2.group(0))

    print('FATTO')  # Italian for "DONE"
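
The snippet above matches links one line at a time with a regular expression, so it misses tags that span several lines and prints only URLs ending in "zip". As an alternative sketch (not the author's method), Python 3's standard `html.parser` module can collect every `href` value. The names `LinkCollector` and `extract_links`, and the optional `suffix` filter, are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.hrefs.append(value)


def extract_links(html_text, suffix=None):
    """Return all hrefs; if suffix is given, keep only those ending with it."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    if suffix is None:
        return parser.hrefs
    # Compare case-insensitively so '.ZIP' and '.zip' are treated alike.
    return [h for h in parser.hrefs if h.lower().endswith(suffix.lower())]
```

To reproduce the original behaviour, one might call `extract_links(open('file.html').read(), suffix='.zip')` and print each result.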
