The neverending story about Guids

08okt 2018 door Jos Menhart

Ok. So, today I came across a performance issue at work that might or might not have to do with the fact that we are using random Guids in a certain table. I decided to accelerate the post I was writing about this, so it could benefit me and my colleagues, and of course, you.

In my days as a SQL Developer and DBA I have come across a lot of performance issues that were caused by these tiny little monsters. To explain, let’s take a dive into the subject:

SQL Server is a great database product with great performance. That is, when you know what (and why) you are doing. Of course, you all know a lot of that performance has to do with indexes, so you want your indexes to be allright. Right?

On the other hand, our developers like being in control of their objects they are using in their code. For that reason it is popular practice to create a bunch of objects in say, C#, link them together by using
Guid myGuid = new Guid();
in their C# code to link objects together.

The problem with this is that when you are saving a lot of these objects in your database, SQL Server does not like that. This has to do with indexes of course, as I was hinting on before.

Indexes are data structures that were made to be in order, like, sequentially I mean. So if you have an index with ascending numbers like {1,2,3,4,5} you will be ok. SQL Server will order these for you inside your index and your queries will be fast.
However, Guids are random by nature, and however useful from a developer perspective, they will destroy the performance of your database over time when you are using them as a key in your search parameters. In my experience this is especially true when using them as join columns.

Here is an example of an index that has been ruined by the inserts of many random Guids:

You can see the fragmentation on that index, which is almost 100%.

Now take a look at what happens when we insert Guids, but with a little tweak. They are just as unique as random Guids, but now they are ordered in an ascending way, helping SQL Server to make use of it:

You can see that the fragmentation has not even reached 1% on this one. So what does this all mean? Let’s do a quick demo with a little tool I just wrote in C#. This tool shows you the performance impact of inserting a few rows into the tables I showed above:

Here I am using two different algorithms to perform 10 inserts into the tables you have seen before. The statistics grid shows that with 10 inserts, the sequential Guids options is more than 3 times as fast as the random (native) one.

If you want to experiment yourself, you can find my code on GitHub.

As a follow-up to this post I would like to include more statistics, like for instance, how join performance is affected. For now, this is it.

If you have any feelings/thoughts/questions about this post I encourage you to leave a comment in the section below and we might get in touch.