The myth of System.Text.StringBuilder

Lots of people write about how awesome the StringBuilder class is in .NET.  And in lots of scenarios it is in fact a big improvement over the alternatives.  The reality, however, is that a tool is only as good as the way you use it.  In some cases, a StringBuilder can actually perform WORSE than simply calling String.Concat(string, string), or even writing string + string.  Let’s take a few minutes to explore the underpinnings and see why.

The StringBuilder class basically acts as a glorified List.  It works by pre-allocating a block of memory, so that when you append to it the space is already there and the new data is simply copied in.  Without it, every concatenation allocates a brand-new string and copies both operands into it.  Its second benefit is that instead of a whole bunch of intermediate strings, the StringBuilder leaves only a single object to be garbage collected.  Garbage collection can be a relatively costly expense.

Most StringBuilders are created with the empty constructor.  In this case, the system allocates a fixed chunk of memory (16 characters’ worth) for you.  When you append to the string builder, one of two things happens.  If the new text fits, it is copied into the existing memory space.  If it does not fit, the builder creates a whole new, larger allocation and copies everything into it.  Offhand, I believe the new allocation is double the size of the current buffer, or bigger if the appended string still wouldn’t fit.
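To make the two cases above concrete, here is a minimal sketch that prints Length and Capacity as text is appended.  The class name BuilderCapacityDemo and the helpers DefaultCapacity and Report are made up for illustration.  Only the starting capacity of 16 is something I would count on; how far the capacity grows afterwards depends on the runtime version, so treat the later numbers as something to observe rather than rely on.

```csharp
using System;
using System.Text;

public static class BuilderCapacityDemo
{
    // Hypothetical helper: capacity of a builder made with the empty constructor.
    public static int DefaultCapacity()
    {
        return new StringBuilder().Capacity;
    }

    // Hypothetical helper: prints how full the builder's buffer currently is.
    private static void Report(string label, StringBuilder sb)
    {
        Console.WriteLine(label + ": length = " + sb.Length + ", capacity = " + sb.Capacity);
    }

    public static void Main()
    {
        StringBuilder sb = new StringBuilder();
        Report("empty", sb);                 // capacity is 16 here

        sb.Append(new string('x', 10));      // 10 chars fit in the initial buffer
        Report("after 10 chars", sb);

        sb.Append(new string('y', 10));      // 20 chars no longer fit in 16
        Report("after 20 chars", sb);        // capacity has grown; exact value varies by runtime
    }
}
```

Running this on your own target framework is a quick way to check the growth behaviour before you lean on it.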

Think about the string builder as a list.  If I wrote:

// requires: using System.Collections.Generic;
List<string> strings = new List<string>(4);

// Add 5 items to a list that was sized for 4.
for (int i = 0; i < 5; i++)
{
    strings.Add("newstring");
}

What do you think would happen?  The List has been set to hold up to 4 items (with an empty constructor it starts with no backing storage at all and jumps to a capacity of 4 on the first Add).  When you add the 5th item, the list doubles its capacity and copies its contents into the new space.  In short, List<string>(4) becomes List<string>(8).  You use only 5 of those slots; the extra 3 are set to null.
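If you would rather see the doubling than take my word for it, the Capacity property shows it directly.  This is only a small sketch; the class name ListGrowthDemo and the helper CapacityAfterAdds are invented for the example.

```csharp
using System;
using System.Collections.Generic;

public static class ListGrowthDemo
{
    // Hypothetical helper: adds `count` strings and returns the resulting capacity.
    public static int CapacityAfterAdds(List<string> list, int count)
    {
        for (int i = 0; i < count; i++)
        {
            list.Add("newstring");
        }
        return list.Capacity;
    }

    public static void Main()
    {
        // Sized for 4, then given 5 items: the backing array doubles.
        Console.WriteLine(CapacityAfterAdds(new List<string>(4), 5));   // prints 8

        // The empty constructor starts at 0 and jumps to 4 on the first Add.
        List<string> empty = new List<string>();
        Console.WriteLine(empty.Capacity);                              // prints 0
        Console.WriteLine(CapacityAfterAdds(empty, 1));                 // prints 4
    }
}
```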

With a list that isn’t a huge deal, since it only holds references to other objects.  However, if we use a string builder that way, we negate the primary performance advantage it offers.

Now look at the same basic problem with a StringBuilder.  Suppose you created a string builder like this:

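// requires: using System.Linq; using System.Text;
StringBuilder sb = new StringBuilder();
sb.Append("b");
sb.Append(new string('a', 1000));   // 1,000 x "a"
sb.Append(new string('d', 3000));   // 3,000 x "d"
sb.Append(string.Concat(Enumerable.Repeat("idea", 10000)));   // 10,000 x "idea"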

The end result is a small string builder that is then blown up way out of proportion to its original size.  The first append leaves the builder at its small initial size plus a little spare room.  The second append forces it to copy its contents into a new space with just a little extra room (determined by that first “b”).  The third and fourth appends make the whole process start over again.

The end result is a lot of memory allocated, but not much of it actually in use.  It also results in lots of garbage collection, and avoiding that is one of the primary advantages of using the StringBuilder in the first place.  Don’t discount the cost of garbage collection.  While it’s eating up your unreferenced objects, it might just eat up your performance too.
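One way to sidestep the regrowth, assuming you can estimate the final length up front, is to hand that number to the StringBuilder(int capacity) constructor.  The sketch below runs the same four appends against a presized builder and a default one and reports whether the capacity changed along the way.  The class name PresizedBuilderDemo and the helper GrewDuringAppends are my own inventions for the example, and it is an illustration rather than a benchmark.

```csharp
using System;
using System.Linq;
using System.Text;

public static class PresizedBuilderDemo
{
    // Total characters appended below: 1 + 1,000 + 3,000 + 4 * 10,000.
    public const int TotalLength = 1 + 1000 + 3000 + 4 * 10000;

    // Hypothetical helper: runs the four appends and reports whether
    // the builder's capacity changed while doing so.
    public static bool GrewDuringAppends(StringBuilder sb)
    {
        int before = sb.Capacity;
        sb.Append("b");
        sb.Append(new string('a', 1000));
        sb.Append(new string('d', 3000));
        sb.Append(string.Concat(Enumerable.Repeat("idea", 10000)));
        return sb.Capacity != before;
    }

    public static void Main()
    {
        // Presized: the buffer is already big enough for everything.
        Console.WriteLine("presized grew: " + GrewDuringAppends(new StringBuilder(TotalLength)));

        // Default: starts small and has to grow to hold 44,001 characters.
        Console.WriteLine("default grew:  " + GrewDuringAppends(new StringBuilder()));
    }
}
```

If you cannot predict the length exactly, even a rough overestimate passed to the constructor should cut down on the regrowth described above.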

If there is a reason to use one tool over another, you should take ownership of the matter and research it yourself.  These tidbits are easily learned with a little time spent in Reflector.  Don’t use these things blindly, because misuse can easily negate the very benefit people are trying to convince you they offer.  Take ownership of your tools.

There are far more enlightening texts on the subject here: