PHP: how to remove bad characters from a string

I am using *file_get_contents* to get some remote text, and the text contains left/right double quoted text such as “Green Slime”.

*file_get_contents* returns this text as �Green Slime�.

Looking at the remote source, the “” characters are literal, not entity codes. There is no character set definition in the source.

Is there a context that I can add to *file_get_contents* to correct this? If not, how can I *str_replace* these characters?

EDIT: Obvious solutions like htmlentities() and str_replace() do not work. I also get the same characters returned when using cURL.

asked Mar 10 '12 at 16:03

How about htmlentities? I am not sure. Have you tried that? -

4 Answers

HTML Entities.

This will solve your problem and fix the output.

answered Mar 10 '12 at 16:03

htmlentities does not convert them. - user191688

I used ord() to determine that these characters are chr(147) and chr(148), then used str_replace(chr(147), "&#147;", $str).

Not sure why both file_get_contents and curl return this content in a way that can't be displayed in a browser.
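Byte values 147 and 148 are where Windows-1252 places the left and right curly double quotes, so one plausible explanation is that the remote page is served in that encoding while the local page is displayed as UTF-8. A minimal sketch under that assumption (the source encoding is a guess, the mbstring extension is assumed to be available, and `$raw` is a stand-in for the fetched text) re-encodes the whole string instead of replacing characters one at a time:

```php
<?php
// Assumption: the remote text is Windows-1252 (CP1252), where bytes
// 0x93 and 0x94 (chr(147), chr(148)) are the curly double quotes.
// $raw stands in for the string returned by file_get_contents().
$raw = "\x93Green Slime\x94";

// Convert every byte from Windows-1252 to UTF-8 in one pass.
$utf8 = mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');

// On a page declared as UTF-8, this displays the quotes correctly.
echo $utf8, PHP_EOL;
```

If the remote page turns out to use a different encoding, the third argument would need to change accordingly.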

answered Mar 10 '12 at 19:03

Put this immediately under the head tag:

<meta charset="utf-8">

answered Jul 18 '12 at 05:07

I already have that in my page. I Can't change the remote page. - user191688

You can then use a string replacement to swap those quotes for others, for example: $s = str_replace('“', '"', $s); echo $s; - sm13294

Look into utf8_decode/encode functions
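Note that utf8_encode() treats its input as ISO-8859-1, where bytes 147 and 148 are control characters rather than quotes, so it may not help here. A hedged alternative is to convert only when the text is not already valid UTF-8. In the sketch below, `to_utf8` is a hypothetical helper name, and the Windows-1252 fallback is an assumption about the remote page, not something the source confirms:

```php
<?php
// Hypothetical helper: returns $text unchanged if it is already valid
// UTF-8, otherwise re-encodes it from an assumed fallback encoding.
function to_utf8(string $text, string $fallback = 'Windows-1252'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, nothing to do
    }
    // Assumption: invalid UTF-8 means the text is in $fallback.
    return mb_convert_encoding($text, 'UTF-8', $fallback);
}

// Stand-in for remote text containing bytes 147 and 148.
$fixed = to_utf8("\x93Green Slime\x94");
echo $fixed, PHP_EOL;
```

The validity check keeps text that is already UTF-8 from being converted twice, which would otherwise corrupt it.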

answered Mar 10 '12 at 16:03

UTF-8 isn't the only charset with these quotes. - Ignacio Vázquez-Abrams
