Python HTTP GET - cannot replicate a curl request with headers
I have the following curl command:

    curl -H "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0" -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" -H "Connection: keep-alive" -X GET http://example.com/en/number/111555000
Unfortunately, I am not able to replicate it in Python. I tried with:

    # Python 2 (urllib2)
    import urllib2

    url = "http://example.com/en/number/111555000"
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Connection': 'keep-alive',
    }
    # Second argument is the request body; None makes this a GET.
    req = urllib2.Request(url, None, headers)
    resp = urllib2.urlopen(req)
    print resp.read()
But the server recognizes the request as "fake" and forwards me to Google (the server's reply is: HTTP/1.1 301 Moved Permanently). With curl, I instead receive the original page.

Any ideas or suggestions? Thanks, dk
Edit: additional info:

    $ nc example.com 80
    GET /en/number/111555000 HTTP/1.1
    Host: example.com

    HTTP/1.1 301 Moved Permanently
    Date: Fri, 29 May 2015 18:51:05 GMT
    Server: Apache
    X-Powered-By: PHP/5.5.24
    Location: http://www.google.de
    Content-Length: 0
    Content-Type: text/html

    $ nc example.com 80
    GET /en/number/111555000 HTTP/1.1
    Host: example.com
    User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Connection: keep-alive

    HTTP/1.1 200 OK
    Date: Fri, 29 May 2015 18:57:56 GMT
    Server: Apache
    X-Powered-By: PHP/5.5.24
    Set-Cookie: session=a%3a4%3a%7bs...
    Set-Cookie: session=a%3a4%3a%7bs...
    Keep-Alive: timeout=2, max=200
    Connection: keep-alive
    Transfer-Encoding: chunked
    Content-Type: text/html; charset=utf-8

    1c6f8
    <!doctype html>
    [...]
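The nc sessions above show that the same path returns 301 or 200 depending only on the request headers. A minimal sketch of the same experiment from Python, using a plain socket so that no HTTP library adds or rewrites headers, might look like this. The names `build_request`, `fetch_status_line`, and `BROWSER_HEADERS` are hypothetical helpers introduced here for illustration; they are not part of the original post.

```python
import socket

# Headers copied from the curl command in the question.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) "
                  "Gecko/20100101 Firefox/38.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Connection": "keep-alive",
}


def build_request(host, path, headers=None):
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = ["GET {} HTTP/1.1".format(path), "Host: {}".format(host)]
    for name, value in (headers or {}).items():
        lines.append("{}: {}".format(name, value))
    # An empty line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch_status_line(host, path, headers=None, port=80, timeout=10):
    """Send the request over TCP and return only the response status line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path, headers))
        data = b""
        # Read until the first line of the response is complete.
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n", 1)[0].decode("iso-8859-1")
```

Calling `fetch_status_line("example.com", "/en/number/111555000")` and then the same call with `BROWSER_HEADERS` as the third argument should reproduce the two status lines seen with nc, assuming the server still behaves as described. Removing one header at a time is a simple way to find out which one the server actually checks.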
With curl:

    $ curl -X GET http://example.com/en/number/111555000
    $
    $ curl -H "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0" -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" -H "Connection: keep-alive" -X GET http://example.com/en/number/111555000
    <!doctype html>
    [...]
I can get it to work with the requests library, which is probably the better one to use anyway.

    import requests

    url = "http://example.com/en/number/111555000"
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Connection': 'keep-alive',
    }
    # Custom headers are passed with the headers= keyword argument.
    req = requests.get(url, headers=headers)
    req.text
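To compare what requests puts on the wire with what curl sends, the headers can be inspected before any network traffic happens. The following is a rough sketch using a prepared request; the variable names are illustrative. One possible explanation for the urllib2 failure is that urllib2 sends `Connection: close` regardless of the header passed in, but whether this server checks that header is an assumption, not something confirmed in this thread.

```python
import requests

url = "http://example.com/en/number/111555000"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) "
                  "Gecko/20100101 Firefox/38.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Connection": "keep-alive",
}

session = requests.Session()
# prepare_request merges the session's default headers with the ones given
# here, without sending anything over the network.
prepared = session.prepare_request(requests.Request("GET", url, headers=headers))

# Print the merged header set, one header per line.
for name, value in prepared.headers.items():
    print("{}: {}".format(name, value))

# To actually perform the request afterwards:
# response = session.send(prepared)
```

The printed list also includes defaults that requests adds on its own, such as `Accept-Encoding`, so it may not match the curl request exactly.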
Here is the requests library documentation.

Hope that helps.