Revision: 68289
at December 26, 2014 08:33 by tionazo


Initial Code
import urllib2
import re

#connect to a URL (placeholder address; replace with the page to scan)
url = "http://www.example.com"
website = urllib2.urlopen(url)

#read html code
html = website.read()

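#use re.findall to get all the links; the scheme group is non-capturing
#so findall returns a list of URL strings rather than (url, scheme) tuples
links = re.findall(r'"((?:http|ftp)s?://.*?)"', html)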

print links
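
The snippet above targets Python 2 (urllib2 and the print statement). A rough Python 3 equivalent might look like the sketch below. The names extract_links, fetch_html and LINK_PATTERN are made up for illustration and are not part of the original snippet, and the UTF-8 decoding is an assumption about the page being fetched. Regex matching on raw HTML only finds double-quoted absolute URLs, so treat this as a quick approximation rather than a full HTML parser.

```python
import re
from urllib.request import urlopen

# Same idea as the original pattern, but the scheme group is non-capturing
# so re.findall returns plain strings instead of (url, scheme) tuples.
LINK_PATTERN = re.compile(r'"((?:http|ftp)s?://.*?)"')


def extract_links(html):
    """Return every double-quoted http(s)/ftp(s) URL found in html."""
    return LINK_PATTERN.findall(html)


def fetch_html(url, encoding="utf-8"):
    """Download url and decode the body (the encoding is an assumption)."""
    with urlopen(url) as response:
        return response.read().decode(encoding, errors="replace")


# Offline demonstration on a small hand-written HTML fragment.
sample = (
    '<a href="http://example.com/a">A</a> '
    '<a href="ftps://example.org/f">F</a> '
    '<a href="/local">L</a>'
)
print(extract_links(sample))
# prints ['http://example.com/a', 'ftps://example.org/f']
```

To scan a live page, pass the result of fetch_html(url) to extract_links. Relative links such as "/local" are skipped because the pattern requires an explicit scheme.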

Initial URL
http://www.pythonforbeginners.com/code/regular-expression-re-findall

Initial Description
Get all links from a website by reading its HTML and matching quoted http(s)/ftp(s) URLs with re.findall.
From: http://www.pythonforbeginners.com/code/regular-expression-re-findall

Initial Title
Get all links from a website

Initial Tags
regex, python, web

Initial Language
Python