Convert DIV to SPAN using str_replace

I have some data that is provided to me as $data. An example of some of the data is...

<div class="widget_output">
<div id="test1">
    Some Content
</div>
    <ul>
        <li>
            <p>
                <div>768hh</div>
                <div>2308d</div>
                <div>237ds</div>
                <div>23ljk</div>
            </p>
       </li>
        <div id="temp3">
            Some more content
        </div>
       <li>
            <p>
                <div>lkgh322</div>
                <div>32khhg</div>
                <div>987dhgk</div>
                <div>23lkjh</div>
            </p>
        </li>
</div>

I am attempting to change the invalid HTML DIVs inside the paragraphs so I end up with this instead...

   <div class="widget_output">
<div id="test1">
    Some Content
</div>
    <ul>
        <li>
            <p>
                <span>768hh</span>
                <span>2308d</span>
                <span>237ds</span>
                <span>23ljk</span>
            </p>
       </li>
        <div id="temp3">
            Some more content
        </div>
       <li>
            <p>
                <span>lkgh322</span>
                <span>32khhg</span>
                <span>987dhgk</span>
                <span>23lkjh</span>
            </p>
        </li>
</div>

I am trying to do this using str_replace with something like...

$data = str_replace('<div>', '<span>', $data);
$data = str_replace('</div>', '</span>', $data);

Is there a way I can combine these two statements and also make it so that they only affect the 'This is a random item' and not the other occurrences?

asked Jul 29 '12 at 22:07

Not sure if it can handle this specific case, but you might want to look into a library such as HTML Purifier, which is designed to (among other things) convert untrusted (e.g., user-input) HTML into standards-compliant markup.

Will the errant text always start with "This is a random item", or are you trying to match any <div> inside a <p>?

3 Answers

$data = str_replace(array('<div>', '</div>'), array('<span>', '</span>'), $data);

As long as you didn't give any other details and only asked:

Is there a way I can combine these two statements and also make it so that they only affect the 'This is a random item' and not the other occurrences?

Here you go:

$data = str_replace('<div>This is a random item</div>', '<span>This is a random item</span>', $data);

answered Jul 29 '12 at 22:07

This will get all <div>s, however. Note that only the innermost <div>s have been converted to <span>s - Michael Berkowski

@Frits van Campen: now it does ;-) - zerkms

@iblue In full agreement with zerkms here. There are lots of times when simple string operations are preferable to DOM manipulation, without resorting to regular expressions in any way. - Michael Berkowski

When I say 'This is a random item', this is just an example. The items are dynamically generated, so they are all different every time. Is there a way I can cater for this fact? - peleastarr20

@Vatev: yep, it becomes it only now ;-) Before that it was a valid candidate for being "not overcomplicated" - zerkms

You'll need to use a regular expression to do what you are looking to do, or actually parse the string as XML and modify it that way. The XML parsing is almost surely the "safest," since as long as the string is valid XML, it will work in a predictable way. Regexes can at times fall prey to strings not being in exactly the expected format, but if your input is predictable enough, they can be OK. To do what you want with regular expressions, you'd do something like

$parsed_string = preg_replace("~<div>(?=This is a random item)(.*?)</div>~", "<span>$1</span>", $input_string);

What's happening here is the regex is looking for a <div> tag which is followed by (using a lookahead assertion) This is a random item. It then captures any text between that tag and the next </div> tag. Finally, it replaces the match with <span>, followed by the captured text from inside the div tags, followed by </span>. This will work fine on the example you posted, but will have problems if, for example, the <div> tag has a class attribute. If you are expecting things like that, either a more complex regular expression would be needed, or full XML parsing might be the best way to go.
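For the XML-parsing route mentioned above, here is a rough sketch using PHP's DOM extension. It makes some assumptions: the fragment in $input_string is a shortened, well-formed stand-in for the real data (the sample in the question is missing a closing </ul>, so loadXML would reject it as posted), and the <div> tags to convert are direct children of a <p>. Attributes on a converted <div> are not copied, which matches the attribute-less tags in the question. I use loadXML rather than loadHTML because an HTML parser may rearrange block elements it finds inside a <p>.

```php
<?php
// Assumption: a shortened, well-formed stand-in for the real $data.
$input_string = '<ul><li><p><div>768hh</div><div>2308d</div></p></li>'
    . '<li><div id="temp3">Some more content</div></li></ul>';

$doc = new DOMDocument();
$doc->loadXML($input_string);

$xpath = new DOMXPath($doc);
// Copy the matches into an array before modifying the tree.
$divs = iterator_to_array($xpath->query('//p/div'));

foreach ($divs as $div) {
    $span = $doc->createElement('span');
    // Move every child node (text or elements) from the <div> into the <span>.
    while ($div->firstChild) {
        $span->appendChild($div->firstChild);
    }
    // Swap the emptied <div> for the filled <span>.
    $div->parentNode->replaceChild($span, $div);
}

// Serialize only the root element, so no XML declaration is emitted.
$parsed_string = $doc->saveXML($doc->documentElement);
echo $parsed_string;
```

The <div id="temp3"> is left alone because it is not a child of a <p>. If the real input cannot be made well-formed first, this approach does not apply as written.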

answered Jul 29 '12 at 23:07

The text 'This is a random item' was just in my example to make a point that it was random, I have edited the original post to make it a bit more obvious. - peleastarr20

Is there some kind of wildcard maybe I can use in your example in place of the 'This is a random item'? - peleastarr20

I'm a little surprised by the other answers. I thought someone would post a good one, but that hasn't happened. str_replace is not powerful enough in this case, and regular expressions are hit-and-miss, so you need to write a parser.

You don't have to write a full HTML-parser, you can cheat a bit.

$in = '<div class="widget_output">
(..)
</div>';

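// Work line by line; this assumes <p> and </p> each sit on their own line.
$lines = explode("\n", $in);

$in_paragraph = false;
foreach ($lines as $nr => $line) {
    if (strpos($line, '<p>') !== false) {
        // Opening tag found: the following lines are inside a paragraph.
        $in_paragraph = true;
    } elseif (strpos($line, '</p>') !== false) {
        // Closing tag found: stop replacing.
        $in_paragraph = false;
    } elseif ($in_paragraph) {
        // Only lines inside a paragraph get their <div> tags swapped.
        $lines[$nr] = str_replace(array('<div>', '</div>'), array('<span>', '</span>'), $line);
    }
}
echo implode("\n", $lines);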

The critical part here is detecting whether you're in a paragraph or not. And only when you're in a paragraph, do the string replacement.

Note: I'm splitting on newlines (\n) which is not perfect, but works in this case. You might want to improve this part.
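One possible way to drop the dependence on newlines, as a sketch only: let preg_replace_callback find each <p>...</p> block and run the same str_replace on just that block. The function name divs_to_spans_in_paragraphs is made up for this example. It assumes the paragraph tags are written exactly as <p> and </p> (no attributes), that paragraphs are not nested, and that the <div> tags to convert have no attributes.

```php
<?php
// Hypothetical helper (name invented for this sketch): convert attribute-less
// <div> tags to <span>, but only inside <p>...</p> blocks.
function divs_to_spans_in_paragraphs(string $html): string
{
    $result = preg_replace_callback(
        // "s" lets "." match newlines; ".*?" stops at the first </p>.
        '~<p>.*?</p>~s',
        function (array $match): string {
            // $match[0] is one whole paragraph, tags included.
            return str_replace(
                array('<div>', '</div>'),
                array('<span>', '</span>'),
                $match[0]
            );
        },
        $html
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$data = "<div id=\"test1\">Some Content</div>\n"
    . "<p><div>768hh</div> <div>2308d</div></p>";
echo divs_to_spans_in_paragraphs($data);
```

Because the replacement only ever sees the text of a matched paragraph, the <div id="test1"> outside it is untouched, and it no longer matters whether the tags share a line.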

answered Jul 30 '12 at 10:07
