Published in: Python

Scrapy crawl through a GoAgent proxy
Assume GoAgent is listening at http://127.0.0.1:8087,
that you have created a Scrapy project named myscrapy,
and that your working directory is myscrapy.
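# file: myscrapy/settings.py
...
# Despite its name, USER_AGENT holds the GoAgent proxy address here.
USER_AGENT = 'http://127.0.0.1:8087'

# In Scrapy 1.0 and later, the built-in middlewares live under
# scrapy.downloadermiddlewares instead of scrapy.contrib.downloadermiddleware.
DOWNLOADER_MIDDLEWARES = {
    # Runs before HttpProxyMiddleware and sets request.meta['proxy'].
    'myscrapy.middlewares.MyProxyMiddleware': 100,
    'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 110,
    # With this disabled, the proxy address stored in USER_AGENT
    # is not sent as the User-Agent header.
    'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
}
...

# file: myscrapy/middlewares.py
from myscrapy.settings import USER_AGENT


class MyProxyMiddleware(object):
    def process_request(self, request, spider):
        # Route every request through the GoAgent proxy.
        request.meta['proxy'] = USER_AGENT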
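The snippet above reuses the USER_AGENT setting to carry the proxy address, which is why the user-agent middleware has to be switched off. A possible variation, sketched here under assumptions, keeps the proxy address in its own setting so USER_AGENT stays free for a real user-agent string. The setting name GOAGENT_PROXY and the class name GoAgentProxyMiddleware are hypothetical and do not appear in the original snippet; from_crawler is the hook Scrapy calls to build a middleware from the running crawler. The fake crawler and request objects at the bottom are stand-ins so the sketch can be run without Scrapy installed.

```python
from types import SimpleNamespace

# Hypothetical default; in a real project this would live in myscrapy/settings.py
# as GOAGENT_PROXY = 'http://127.0.0.1:8087'.
DEFAULT_GOAGENT_PROXY = 'http://127.0.0.1:8087'


class GoAgentProxyMiddleware(object):
    """Downloader middleware that sends requests through a GoAgent proxy."""

    def __init__(self, proxy_url):
        self.proxy_url = proxy_url

    @classmethod
    def from_crawler(cls, crawler):
        # Read the proxy address from the project settings,
        # falling back to the default when the setting is absent.
        return cls(crawler.settings.get('GOAGENT_PROXY', DEFAULT_GOAGENT_PROXY))

    def process_request(self, request, spider):
        # Keep a proxy that a spider already set on the request;
        # otherwise use the GoAgent address.
        request.meta.setdefault('proxy', self.proxy_url)
        return None  # let Scrapy continue processing the request


# Stand-in objects (not Scrapy classes) to exercise the middleware.
fake_crawler = SimpleNamespace(settings={'GOAGENT_PROXY': 'http://127.0.0.1:8087'})
middleware = GoAgentProxyMiddleware.from_crawler(fake_crawler)
fake_request = SimpleNamespace(meta={})
middleware.process_request(fake_request, spider=None)
print(fake_request.meta['proxy'])
```

With this layout the entry in DOWNLOADER_MIDDLEWARES would point at the new class, and the UserAgentMiddleware line could be left at its default instead of None.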
