Knowledge is Here...: uniqueidentifier [guid] vs int

Wednesday, March 12, 2014

uniqueidentifier [guid] vs int

The uniqueidentifier data type stores 16-byte binary values that operate as globally unique identifiers (GUIDs). A GUID is a unique binary number; no other computer in the world will generate a duplicate of that GUID value.
On other hand int data type store 4-byte value.

A GUID value for a uniqueidentifier column is usually obtained by one of the following ways:

In a Transact-SQL statement, batch, or script by calling the NEWID function.
In application code by calling an application API function or method that returns a GUID.

The uniqueidentifier data type has the following disadvantages:

The values are long and obscure. This makes them difficult for users to type correctly, and more difficult for users to remember.
The values are random and cannot accept any patterns that may make them more meaningful to users.
There is no way to determine the sequence in which uniqueidentifier values were generated. They are not suited for existing applications that depend on incrementing key values serially.
At 16 bytes, the uniqueidentifier data type is relatively larger than other data types, such as 4-byte integers. This means indexes that are built using uniqueidentifier keys might be relatively slower than indexes using an int key.

Cluster-index can be apply on:-

Random GUIDs (using NEWID function)
Sequential GUIDs (using NewSequentialGUID function)
BIGINT identity value
INT identity value

Below is the table for comparing and ranking for above datatypes:

Criteria	GUIDs	Seq. GUIDs	BIGINT	INT
Storage	16 bytes	16 bytes	8 bytes	4 bytes
Insert/Update performance	Slowest	Comparable but the index keys are larger in size. For wider tables, this would be slower than Integer values.	Faster than sequential GUIDs	Fastest
Hotspot contention	Very rare	Yes	Yes	Highest, due to smaller size of RIDs
Fragmentation/Page Splits	High	Minimal	Minimal	Minimal
JOIN Performance/SORT operations	Least performance (Rank 4 = Least performance)	Better than random GUIDs due lesser fragmentation (Rank: 3)	High performance (Rank: 2)	High Performance (Rank: 1)
Logical reads	Rank 4=Highest	Rank 3	Rank 2	Rank 1=Least
Merging data across servers	Beneficial	Beneficial	Difficult	Difficult
Uniqueness	Rare chance of duplicates	Globally unique. Virtually no chance of collisions	Limited by range of BIGINT	Limited by range of INT

The above considerations make the use of GUIDs unfavorable for a clustered index in environments which have large number of queries performing JOIN operations in OLTP and when referential integrity is enforced on the database among multiple tables. Throw non-clustered indexes that you created on the table as covering indexes for the frequently run queries against the database, you can have a significant performance bottleneck.

Source:
http://blogs.msdn.com/b/sqlserverfaq/archive/2010/05/27/guid-vs-int-debate.aspx?Redirected=true

Knowledge is Here...

Pages

Labels

Wednesday, March 12, 2014

uniqueidentifier [guid] vs int

No comments:

Post a Comment