urllib2 request problem

I am trying to open a page using urllib2:

import urllib2

req = urllib2.Request("http://1033kissfm.com",
        headers={'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:11.0) Gecko/20100101 Firefox/11.0'})
response = urllib2.urlopen(req)
rstPage = response.read()

and the response is

<html>
<head><title>400 Bad Request</title></head>
<body bgcolor="white">
<center><h1>400 Bad Request</h1></center>
<hr><center>nginx/1.0.3</center>
</body>
</html>

But when I open this URL in a browser, it works fine. This is the URL:

http://1033kissfm.com

In the browser it redirects to

http://www.1033kissfm.com/pages/main

asked Jul 29 '12 at 11:07

This is a website-specific issue, not a Python problem. The site probably looks for headers or other information as proof that you are using a web browser, not a script.

As such, your question is too localized for Stack Overflow; an answer will only help you, not anyone else, as it cannot be generalized.

It's a Python issue. If it's a bad request, why does it open in a web browser?

You'd have the same problem with Perl, or C, or Java. It is an issue with the website, not the Python urllib2 library.

I am sure it's not a library issue, nor am I here to prove that, but I am sure there is something I don't know that would make it work through the Python library.

1 Answer

I resolved the issue. As far as I can tell, the library does not follow redirects declared in an HTML meta tag, so this code follows them by hand and collects every hop until it gets the proper response:

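import re
import urllib2
import urlparse

def get_hops(url):
    """Follow HTTP and meta-tag redirects; return the URLs visited, newest first."""
    redirect_re = re.compile('<meta[^>]*?url=(.*?)["\']', re.IGNORECASE)
    hops = []
    while url:
        if url in hops:
            break  # already visited: stop instead of looping forever
        hops.insert(0, url)
        response = urllib2.urlopen(url)
        # urlopen follows HTTP redirects itself; record where it ended up
        if response.geturl() != url:
            hops.insert(0, response.geturl())
        # check the page body for a redirect meta tag
        match = redirect_re.search(response.read())
        if match:
            # resolve a relative target against the URL actually served
            url = urlparse.urljoin(response.geturl(), match.group(1).strip())
        else:
            url = None
    return hops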
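
Here is a minimal sketch of just the meta-tag step, in case it helps to try it without touching the network. The name find_meta_refresh and the sample HTML are made up for illustration; the regex is the one from get_hops above, and the import fallback is only there so the snippet runs on both Python 2 and Python 3.

```python
import re

try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2 (same module the answer uses)

# Same pattern as in get_hops: grabs whatever follows "url=" inside a <meta> tag.
REDIRECT_RE = re.compile('<meta[^>]*?url=(.*?)["\']', re.IGNORECASE)


def find_meta_refresh(html, base_url):
    """Return the absolute meta-tag redirect target found in html, or None.

    Hypothetical helper: the name and signature are assumptions.
    """
    match = REDIRECT_RE.search(html)
    if not match:
        return None
    # The target may be relative, so resolve it against the page's own URL.
    return urljoin(base_url, match.group(1).strip())


# Made-up sample page, not the real response from any site.
sample = '<html><head><meta http-equiv="refresh" content="0; url=/pages/main"></head></html>'
print(find_meta_refresh(sample, "http://example.com"))
print(find_meta_refresh("<html><body>no redirect here</body></html>", "http://example.com"))
```

The first call prints http://example.com/pages/main and the second prints None. A regex is a rough tool for HTML, so treat this as a starting point rather than a complete parser.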

answered Jul 29 '12 at 13:07
