Django: Non-ASCII character

My Django view/template cannot handle special characters. The simple view below fails because of the ñ. I get the error below:

Non-ASCII character '\xf1' in file"

def test(request):
    return HttpResponse('español')

Is there some general setting that I need to set? It would be weird if I had to handle all strings separately: non-American letters are pretty common!

EDIT: This is in response to the comments below. It still fails :(

I added the coding comment to my view and the meta info to my html, as suggested by Gabi.

Now my example above doesn't give an error, but the ñ is displayed incorrectly.

I tried return render_to_response('tube/mysite.html', {"s": 'español'}). No error, but it doesn't display (it does if s = hello). The other information on the html page displays fine.

I tried hardcoding 'español' into my HTML and that fails:

UnicodeDecodeError 'utf8' codec can't decode byte 0xf.

I tried with the u in front of the string:

SyntaxError (unicode error) 'utf8' codec can't decode byte 0xf1

Does this help at all??

asked Jan 8 '11 at 17:01

What is the actual error that you are getting? Is it UnicodeDecodeError? -

Which version of Django are you using? It's not 0.96 is it? -

You need to ensure that your editor is saving the file with the encoding that you've specified. -

7 Answers

Do you have this at the beginning of your script:

# -*- coding: utf-8 -*-

? ...

See this: http://www.python.org/dev/peps/pep-0263/
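
A minimal sketch of what the PEP 263 declaration does, written for Python 3 (where compile() accepts raw bytes and honors the coding comment). The compile_source helper and the views_sample.py file name are made up for illustration. It suggests that the declaration only helps when it matches the bytes actually saved on disk:

```python
# "\u00f1" is the letter ñ; escapes keep this sketch independent of how
# the snippet itself is saved.
SOURCE = "# -*- coding: {enc} -*-\ns = 'espa\u00f1ol'\n"


def compile_source(declared, saved_as):
    """Compile the sample module as if declared one way and saved another.

    Hypothetical helper: returns the value of `s`, or None if the
    interpreter rejects the bytes.
    """
    raw = SOURCE.format(enc=declared).encode(saved_as)
    namespace = {}
    try:
        exec(compile(raw, "views_sample.py", "exec"), namespace)
    except (SyntaxError, UnicodeDecodeError):
        return None
    return namespace["s"]


matching = compile_source(declared="utf-8", saved_as="utf-8")
latin1_pair = compile_source(declared="latin-1", saved_as="latin-1")
mismatch = compile_source(declared="utf-8", saved_as="latin-1")
```

The third case mirrors the situation in the question: the file claims UTF-8, but the ñ was saved as the single Latin-1 byte 0xf1, so the source is rejected.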

EDIT: The second problem is about the HTML encoding. Put this in the head of your HTML page (you should send the response as an HTML page, otherwise I don't think you will be able to output that character correctly):

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
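
To see why the declared charset matters, here is a small Python 3 sketch (not Django-specific; the variable names are illustrative) of what a browser effectively does when it decodes the response bytes with the charset it was told to use:

```python
text = "espa\u00f1ol"  # "\u00f1" is ñ

utf8_body = text.encode("utf-8")      # b'espa\xc3\xb1ol'
latin1_body = text.encode("latin-1")  # b'espa\xf1ol'

# Declared charset matches the bytes: the text round-trips.
ok = utf8_body.decode("utf-8")

# UTF-8 bytes read as Latin-1: ñ turns into two wrong characters.
mojibake = utf8_body.decode("latin-1")

# Latin-1 bytes read as UTF-8: 0xf1 is invalid there, so a lenient decoder
# substitutes U+FFFD, which browsers tend to draw as a box or question mark.
replaced = latin1_body.decode("utf-8", errors="replace")
```

The last case is one plausible explanation for a ñ that shows up as a square or other strange symbol.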

answered Jan 8 '11 at 21:01

I updated the problem description based on your comments. It still doesn't work... - dkgirl

return HttpResponse('string') is rough...it would be useful to use an HTML template where you can include the above meta, and then use render_to_response('mytemplate.html') - Dominique Guardiola

This solves the problem of using special chars on explicit strings in code. Thanks. - Leandro Ardissone

Insert at the top of views.py

# -*- coding: utf-8 -*-

And add "u" before your string

my_str = u"plus de détails"

Solved!

answered Sep 24 '12 at 18:09

Your answer is a combination of this and this answer, but less complete. Editing other answers to complete them is preferred. And if not, please be at least as complete as the other ones. - Chris Wesseling

You need the coding comment Gabi mentioned, and you also need the unicode "u" prefix before your string:

return HttpResponse(u'español')
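
A short sketch of the distinction the "u" prefix makes. It is written so that it also runs on Python 3, where every str is already Unicode and the prefix is accepted but redundant; in Python 2 a plain '...' literal is a byte string. The variable names are illustrative:

```python
text = u"espa\u00f1ol"        # Unicode string: 7 characters ("\u00f1" is ñ)
raw = text.encode("utf-8")    # byte string: 8 bytes, because ñ takes two

char_count = len(text)
byte_count = len(raw)
```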

The best page I found on the web explaining all the ASCII/Unicode mess is this one: http://www.stereoplex.com/blog/python-unicode-and-unicodedecodeerror

Enjoy!

answered Jan 8 '11 at 20:01

It's still not working. Adding the comment (but not the u) gives a different kind of error. The word can now be displayed, but the ñ is replaced by a square or other strange symbol, depending on the browser. When I also include the u it won't work at all. I get "utf8' codec can't decode byte 0xf1 in position 0" - dkgirl

Set DEFAULT_CHARSET to 'utf-8' in your settings.py file.

answered Jan 9 '11 at 14:01

the DEFAULT_CHARSET is set to utf-8 by default - Wade Williams

This may have changed during the two years that passed between post and comment. I have taken the liberty to wait two years to say this as well. - random6174

I was struggling with the same issue as @dkgirl, yet despite making all of the changes suggested here I still could not get constant strings that I'd defined in settings.py that contain ñ to show up in pages rendered from my templates.

Instead I replaced every instance of "utf-8" that the above solutions had me add to my Python code with "ISO-8859-1" (Latin-1). It works fine now.

Odd since everything seems to indicate that ñ is supported by utf-8 (and in fact I'm still using utf-8 in my templates). Perhaps this is an issue only on older Django versions? I'm running 1.2 beta 1.

Any other ideas what may have caused the problem? Here's my old traceback:
Traceback (most recent call last):
File "manage.py", line 4, in
import settings # Assumed to be in the same directory.
File "C:\dev\xxxxx\settings.py", line 53
('es', ugettext(u'Espa±ol') ),
SyntaxError: (unicode error) 'utf8' codec can't decode byte 0xf1 in position 0: unexpected end of data

answered Sep 27 '11 at 04:09

Upon study, the only two changes I needed both went into settings.py: gabi's "coding" header (already had the meta tag) and @dominique-guardiola's enforcing unicode interpretation. +1's and thanks for the help! - Ryan

From https://docs.djangoproject.com/en/1.8/ref/unicode/:

"If your code only uses ASCII data, it’s safe to use your normal strings, passing them around at will, because ASCII is a subset of UTF-8.

Don’t be fooled into thinking that if your DEFAULT_CHARSET setting is set to something other than 'utf-8' you can use that other encoding in your bytestrings! DEFAULT_CHARSET only applies to the strings generated as the result of template rendering (and email). Django will always assume UTF-8 encoding for internal bytestrings. The reason for this is that the DEFAULT_CHARSET setting is not actually under your control (if you are the application developer). It’s under the control of the person installing and using your application – and if that person chooses a different setting, your code must still continue to work. Ergo, it cannot rely on that setting.

In most cases when Django is dealing with strings, it will convert them to Unicode strings before doing anything else. So, as a general rule, if you pass in a bytestring, be prepared to receive a Unicode string back in the result."
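
The conversion rule quoted above can be sketched in plain Python 3. The to_text helper below is hypothetical, not Django's API; it only illustrates the described behavior of treating byte strings as UTF-8 and handing back Unicode text:

```python
def to_text(value, encoding="utf-8"):
    """Return Unicode text, decoding byte strings as UTF-8 by default.

    Illustrative stand-in for the behavior described in the quote.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)


from_bytes = to_text(b"espa\xc3\xb1ol")  # UTF-8 bytes in, text out
from_text = to_text("espa\u00f1ol")       # text passes through unchanged

try:
    to_text(b"espa\xf1ol")               # Latin-1 bytes are not valid UTF-8
    latin1_failed = False
except UnicodeDecodeError:
    latin1_failed = True
```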

answered May 5 '15 at 08:05

The thing about encoding is that apart from declaring to use UTF-8 (via <meta> and the project's settings.py file) you should of course respect your declaration: make sure your files are saved using UTF-8 encoding.

The reason is simple: you tell the interpreter to do I/O using a specific charset. If you did not save your files with that charset, the interpreter will get lost.

Some IDEs and editors use Latin-1 (ISO-8859-1) by default, which explains why Ryan's answer could work. It is a quick fix, though, not a valid solution to the original question.
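
If an editor did save a file as Latin-1, one way to bring it in line with the UTF-8 declaration is to re-encode it. The sketch below assumes Python 3; reencode_to_utf8 and the temporary file are illustrative names, and the Latin-1 fallback is an assumption about how the file was originally saved:

```python
import os
import tempfile


def reencode_to_utf8(path, fallback="latin-1"):
    """Rewrite `path` as UTF-8; return the encoding it was read with."""
    with open(path, "rb") as handle:
        raw = handle.read()
    try:
        text, used = raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 was saved as Latin-1.
        text, used = raw.decode(fallback), fallback
    with open(path, "wb") as handle:
        handle.write(text.encode("utf-8"))
    return used


# Demo on a throwaway file that mimics a template saved as Latin-1.
fd, sample_path = tempfile.mkstemp(suffix=".html")
with os.fdopen(fd, "wb") as handle:
    handle.write("<p>espa\u00f1ol</p>".encode("latin-1"))

first_pass = reencode_to_utf8(sample_path)   # file was Latin-1
second_pass = reencode_to_utf8(sample_path)  # file is now UTF-8
with open(sample_path, "rb") as handle:
    final_bytes = handle.read()
os.remove(sample_path)
```

Most editors can do the same through a "save with encoding" option, which is usually the simpler route.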

answered May 15 '15 at 21:05
