I'm using BeautifulSoup to scrape a page that has an ISO-8859-1 encoding; however, I've run into a little hiccup.
I have a line that reads:
logging.info("Processing [%s]" % (link))
link is one of the values scraped with BeautifulSoup. It is a Unicode string and I can print it by typing
print link. It shows up on the console exactly the way it was scraped but the line above throws this error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 14: ordinal not in range(128)
I've just read up on Unicode, but I can't figure out why Python is able to print the string but can't log it.
The string in question is this:
Any ideas on where I'm mucking this up?
Thanks for your attention.
asked on February 2, 2012 at 10:02
logging doesn't like
unicode; pass it bytes.
logging.info("Processing [%s]" % (link.encode('utf-8')))
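To see where byte 0xc3 in the error message can come from, here is a minimal sketch. The value of link below is a made-up stand-in, since the real string isn't shown in the question. It encodes a Unicode string to UTF-8 bytes, as suggested above, and then shows that decoding those bytes as ASCII fails with the same kind of message.

```python
# Hypothetical stand-in for the scraped value; the real string is not shown.
link = u"http://example.com/caf\xe9"

# Encoding to UTF-8 turns the accented character into two bytes: 0xc3 0xa9.
encoded = link.encode("utf-8")

# Decoding with the matching codec round-trips cleanly.
round_tripped = encoded.decode("utf-8")

# Decoding the same bytes as ASCII fails, because 0xc3 is outside range(128).
try:
    encoded.decode("ascii")
except UnicodeDecodeError as exc:
    error_text = str(exc)

print(error_text)
```

The ASCII decode here is explicit; in Python 2 the same decode can happen implicitly whenever byte strings and Unicode strings get mixed, which is why the error can seem to come out of nowhere.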
I managed to solve this by adding a file called
sitecustomize.py to my
Python/Lib/site-packages directory. This file contained two lines:
import sys and a second line that set the default encoding to utf-8.
The default encoding prior to that was
ascii, hence the errors. Now I don't need to specify an explicit encoding for the link variable, as it uses the default encoding, i.e.
utf-8 and converts it to that.
Of course, I'll never see the characters properly until my terminal uses the same encoding, but that won't break my code.
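For anyone who would rather not change the interpreter-wide default, one possible alternative (not what I did above) is to give the log handler an explicit encoding and pass the Unicode string straight through as a logging argument. This is only a sketch: the logger name, the log path and the link value are all made up for illustration.

```python
import io
import logging
import os
import tempfile

# Hypothetical log location, chosen only for this sketch.
log_path = os.path.join(tempfile.mkdtemp(), "scrape.log")

logger = logging.getLogger("scraper_sketch")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the record away from any console handler

# The handler, not the caller, is responsible for encoding the text.
handler = logging.FileHandler(log_path, encoding="utf-8")
logger.addHandler(handler)

link = u"http://example.com/caf\xe9"  # stand-in for a scraped value

# Pass the value as an argument; logging does the %-formatting itself.
logger.info(u"Processing [%s]", link)

# Flush and detach the handler so the file can be read back.
handler.close()
logger.removeHandler(handler)

with io.open(log_path, encoding="utf-8") as fh:
    logged = fh.read()
```

With this setup the string stays Unicode until the handler writes it out, so nothing depends on the default encoding or on what the terminal can display.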