Replacing text in a huge string without a memory leak

I am currently working on a batch that must generate about 16000 emails in a row (a newsletter).

Whether or not it is spam, my question is about how I generate those emails.

Some fields in the message must be replaced by custom values (date of the day, name of the user, etc).

For deadline and code-reusability reasons, my template is an HTML file with some "_FIELDNAME" fields that can easily be spotted by a regex:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
<p>Hi _NAME, _DATE newsletter.</p>

The file is about 1000 lines long, so it is quite a big string when loaded.

First, I load the HTML template file once into a string:

string template = File.ReadAllText(@"Template/newsletter.html");

And the replacing function looks like this:

return new StringBuilder(template)
    .Replace("_DATE", profileConfig.SelectedMonth.ToString("MMMM yyyy"))
    .Replace("_NAME", profileConfig.Name)
    // ... one Replace call per remaining field
    .ToString();

The problem is that memory consumption increases slightly with each iteration: about 50 MB per 1000 iterations. It comes from my replacing function (when I commented it out, the memory leak disappeared).

How can I replace many fields (~50) in my template without exhausting memory over my 16000 iterations? I tried a couple of things, like using Regex (but it works on string) or temporary files, but neither satisfied me.

Thanks in advance for your help.

asked Jul 31 '12 at 14:07

Why would there be a memory leak here? How do you know that the GC simply did not see the need to collect garbage yet? -

50 MB for 1000 iterations doesn't seem like much. That suggests you'll have hit around 800 MB by the end of your 16000 iterations, and that's assuming you're right about the leak (which I don't think you are). Why is this a problem? -

@MennanKara: There is just one reference to this StringBuilder, and it will get "lost" automatically when the method returns. There's no need to do anything. -

@Jon: The answer on the linked page says exactly the same, no point trying to fix this. -
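One way to test the point raised in these comments is to compare the live managed heap before and after the loop, forcing a full collection each time. The sketch below is only an illustration: `LeakCheck`, `Fill` and `MeasureGrowth` are hypothetical names, and `Fill` is a two-field stand-in for the replacing function from the question. If the reported growth stays near zero, the rising figure in Task Manager is most likely uncollected garbage rather than a leak.

```csharp
using System;
using System.Text;

static class LeakCheck
{
    // Hypothetical stand-in for the replacing function from the question.
    public static string Fill(string template, string name, string date)
    {
        return new StringBuilder(template)
            .Replace("_DATE", date)
            .Replace("_NAME", name)
            .ToString();
    }

    // Returns the change in live managed memory (in bytes) across the loop.
    public static long MeasureGrowth(string template, int iterations)
    {
        // Passing true asks the GC to run a full collection before measuring.
        long before = GC.GetTotalMemory(true);
        for (int i = 0; i < iterations; i++)
        {
            Fill(template, "User" + i, "July 2012");
        }
        long after = GC.GetTotalMemory(true);
        return after - before;
    }
}
```

Calling `LeakCheck.MeasureGrowth(template, 1000)` with the real template gives a number that excludes garbage the collector has simply not reclaimed yet.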

3 Answers

If you can replace your _DATE, _NAME, etc. with {0}, {1}, etc., you can try string.Format().

Template would become:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
<p>Hi {0}, {1} newsletter.</p>

And code would look like this:

return string.Format(template,
        profileConfig.Name,                                  // {0}
        profileConfig.SelectedMonth.ToString("MMMM yyyy"));  // {1}

You actually don't need to go through a StringBuilder at all. You would greatly benefit in speed (and probably in resource usage) if you used File.ReadAllLines() and only swapped values in the lines that contain tokens.
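A minimal sketch of the line-based idea, under the assumption that the token lines are located once and reused for every email. `LineTemplate`, `FindTokenLines` and `Render` are hypothetical names, not part of any library; the lines would come from `File.ReadAllLines()` in the real batch.

```csharp
using System;
using System.Collections.Generic;

static class LineTemplate
{
    // Finds the indices of lines containing at least one token.
    // Intended to run once per template, not once per email.
    public static List<int> FindTokenLines(string[] lines, ICollection<string> tokens)
    {
        var result = new List<int>();
        for (int i = 0; i < lines.Length; i++)
        {
            foreach (string token in tokens)
            {
                if (lines[i].Contains(token))
                {
                    result.Add(i);
                    break;
                }
            }
        }
        return result;
    }

    // Builds one email: only the token lines are rewritten,
    // every other line is reused as-is from the template array.
    public static string Render(string[] lines, List<int> tokenLines,
                                IDictionary<string, string> values)
    {
        var output = (string[])lines.Clone();
        foreach (int index in tokenLines)
        {
            string line = lines[index];
            foreach (KeyValuePair<string, string> pair in values)
            {
                line = line.Replace(pair.Key, pair.Value);
            }
            output[index] = line;
        }
        return string.Join("\n", output);
    }
}
```

With this shape, each iteration allocates new strings only for the few lines that contain tokens, plus the final joined result.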

UPDATE: In order to enforce the use of the string.Format(string format, params object[] args) overload, you may have to put all your arguments into a collection.

The following should make this solution work for you (I tested it with up to 1000 arguments and it is both working and quite fast).

List<string> tokenValues = new List<string>
{
    profileConfig.Name,                                  // {0}
    profileConfig.SelectedMonth.ToString("MMMM yyyy"),   // {1}
    <follow with your other values>
};
return string.Format(template, tokenValues.ToArray()); //.ToArray() is mandatory
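One thing to watch for when using string.Format on HTML: any literal `{` or `}` in the template (inline CSS or JavaScript, for example) is parsed as a format item and can throw a FormatException unless it is doubled. Whether that is what went wrong for the asker is not confirmed in this thread. The sketch below converts a `_TOKEN` template into a safe format string once, up front; `FormatTemplate` and `ToFormatString` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class FormatTemplate
{
    // Escapes literal braces, then turns the i-th token into the placeholder {i}.
    // Assumes no token is a prefix of another token in the list.
    public static string ToFormatString(string template, IList<string> tokens)
    {
        var sb = new StringBuilder(template)
            .Replace("{", "{{")
            .Replace("}", "}}");
        for (int i = 0; i < tokens.Count; i++)
        {
            sb.Replace(tokens[i], "{" + i + "}");
        }
        return sb.ToString();
    }
}
```

The converted string can be built once at start-up and then passed to string.Format for each recipient, with the values supplied in the same order as the token list.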

answered Aug 1 '12 at 08:08

Thanks a lot, I'm going to try those 2 solutions and will give you the results. - Valryon

The string.Format solution doesn't work: there are too many arguments (more than 50) and the formatter just fails. Maybe it doesn't like HTML. For your second idea, how would you replace the tokens? ReadAllLines returns a string[], so do I need to iterate over each line and do a replace? - Valryon

There is a way to "cheat" string.Format() with many arguments. I'm gonna try it with 50+ arguments and edit the answer if it works. - Alex

    var patterns = new Dictionary<string, string>();
    // Keys must match the template tokens exactly (the comparison is case-sensitive).
    patterns["_DATE"] = profileConfig.SelectedMonth.ToString("MMMM yyyy");
    patterns["_NAME"] = profileConfig.Name;

    // Single pass over the template: copy characters, substituting tokens as they are found.
    var builder = new StringBuilder(template.Length);
    for (var i = 0; i < template.Length;)
    {
      var pattern = CompareAndFindPattern(template, i, patterns);
      if (pattern != null)
      {
        builder.Append(pattern.Value.Value);
        i += pattern.Value.Key.Length;
      }
      else
      {
        builder.Append(template[i++]);
      }
    }
    return builder.ToString();

  static KeyValuePair<string, string>? CompareAndFindPattern(string template, int index, Dictionary<string, string> patterns)
  {
    foreach (var pattern in patterns)
      if (string.CompareOrdinal(template, index, pattern.Key, 0, pattern.Key.Length) == 0)
        return pattern;
    return null;
  }
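The same single-pass idea can also be sketched with Regex.Replace and a match evaluator, which finds each token once and looks it up in a dictionary. This is an alternative illustration, not the code from this answer: `RegexTemplate` and `Fill` are hypothetical names, and the pattern assumes every token is an underscore followed by upper-case letters, as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RegexTemplate
{
    // Assumption: tokens look like _NAME or _DATE (underscore + upper-case letters).
    private static readonly Regex Token = new Regex("_[A-Z]+", RegexOptions.Compiled);

    // Scans the template once; unknown tokens are left unchanged.
    public static string Fill(string template, IDictionary<string, string> values)
    {
        return Token.Replace(template, match =>
        {
            string replacement;
            return values.TryGetValue(match.Value, out replacement)
                ? replacement
                : match.Value;
        });
    }
}
```

Because the template is scanned once regardless of how many fields there are, this avoids producing one intermediate result per field.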

answered Jul 31 '12 at 14:07

After trying many solutions, I finally decided to restart my batch from scratch.

I now use a proper XSLT file to generate the HTML from an XML configuration.

Memory consumption still increases over time, but more slowly now. I guess the garbage collector doesn't want to collect, as my computer has 6 GB of RAM and no other huge processes running.

answered Aug 1 '12 at 14:08
