Replace dictionary words with a link using regular expressions

So I have this HTML:

<img src="images" alt="alt" />
alt <a href ="http://google/something">alt</a>
test hallo world monkey
<p>alt</p>

and a dictionary containing

{alt, test, hallo, world, monkey, something}

So I need a regex or another method to replace words that are not inside an <a> tag or an <img> tag. I have tried the following regex:

(?<![a-zA-ZåøæÅØÆ])alt(?![a-zA-ZåøæÅØÆ])^*(?!=)$

http://rubular.com/r/p52ezGmVHO

asked Jul 31 '12 at 10:07

Wouldn't this also replace the alt in your cool emoticon image too, since it has an 'alt' attribute?

2 Answers

You could use a regex with a negative lookbehind and a negative lookahead for letters:

(?<![a-zA-Z])keyword(?![a-zA-Z])

In your example this would look like this:

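// Escape the headword so any regex metacharacters in it are matched literally.
string pattern = "(?<![a-zA-Z])" + Regex.Escape(article.headword) + "(?![a-zA-Z])";
bodyText = Regex.Replace(bodyText, pattern, "<a class=\"dic\" href=\"#\">" + article.headword + "</a>");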

My first intention was to do a positive match on whitespace characters, but then I thought of punctuation and the like: a keyword is still a keyword if it has a .,!? at the end, right? Lookaheads and lookbehinds only check what precedes or follows your keyword. The surrounding characters are not part of the match, so they are not replaced.
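
The lookarounds above do not by themselves keep the keyword out of <a> elements or tag attributes such as alt="alt", which is what the question asks for. One common way to approach that is to let the same regex also match the things to skip and only rewrite the match when the keyword branch fired. This is a minimal sketch, not a full HTML parser: `DictionaryLinker`, `LinkWord` and the group name `word` are hypothetical names for illustration, and it assumes tags contain no `>` inside attribute values and that <a> elements are not nested.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DictionaryLinker
{
    // Hypothetical helper (not from the original posts): wraps standalone
    // occurrences of headword in a link, leaving <a>...</a> elements and
    // the inside of any tag untouched.
    public static string LinkWord(string html, string headword)
    {
        string pattern =
            @"<a\b[^>]*>.*?</a>"   // branch 1: a whole <a>...</a> element (skipped)
            + @"|<[^>]+>"          // branch 2: any other tag, e.g. <img ... /> (skipped)
            + @"|(?<word>(?<![a-zA-Z])" + Regex.Escape(headword) + @"(?![a-zA-Z]))";

        return Regex.Replace(
            html,
            pattern,
            // Only the third branch sets the "word" group; everything else
            // is written back unchanged.
            m => m.Groups["word"].Success
                ? "<a class=\"dic\" href=\"#\">" + m.Value + "</a>"
                : m.Value,
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }
}
```

Calling `DictionaryLinker.LinkWord(bodyText, article.headword)` once per dictionary entry should also leave links created by earlier passes alone, since they are skipped as <a> elements by the first branch. For arbitrary real-world markup, an HTML parser is likely the safer choice.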

answered Jul 31 '12 at 11:07

This is what I ended up doing:

var regex = new Regex("(?<![a-zA-Z" + SpecialChars + "])" + article.headword + "(?![a-zA-Z" + SpecialChars + "])+(?!==)");

bodyText = regex.Replace(bodyText, "<a href=\"#dic\">" + headword + "</a>");

This will only replace the first one
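
To make the first-occurrence-only behaviour explicit, one option is the overload of `Regex.Replace` that takes a maximum number of replacements (without a count, .NET replaces every match). A minimal sketch, where `FirstOnlyLinker`, `LinkFirst` and its parameters are hypothetical names, and the `SpecialChars` part of the pattern is left out for brevity:

```csharp
using System;
using System.Text.RegularExpressions;

public static class FirstOnlyLinker
{
    // Hypothetical helper: links at most `count` standalone occurrences
    // of headword, scanning from the start of the text.
    public static string LinkFirst(string text, string headword, int count)
    {
        var regex = new Regex(
            "(?<![a-zA-Z])" + Regex.Escape(headword) + "(?![a-zA-Z])");

        // "$" is special in replacement strings, so double it to keep it literal.
        string replacement =
            "<a href=\"#dic\">" + headword.Replace("$", "$$") + "</a>";

        // Instance overload Replace(input, replacement, count) stops after
        // `count` replacements.
        return regex.Replace(text, replacement, count);
    }
}
```

With `count` set to 1, only the first standalone occurrence of the headword is turned into a link and later ones are left as plain text.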

answered Apr 19 '13 at 15:04
