Witaj na Zine.net online Zaloguj się | Rejestracja | Pomoc

[EN] Column order - does it matter?

Do you think column order doesn't matter in SQL Server tables? Well, if you say it doesn't I would say the same thing as always: "it depends" :-)

On SQL Server 2005 instance let's create some table with 50 nullable varchar columns:

USE tempdb
GO

IF OBJECT_ID('tempdb.dbo.test_varchar') IS NOT NULL
  DROP TABLE dbo.test_varchar

CREATE TABLE dbo.test_varchar
    (
      id INT PRIMARY KEY IDENTITY(1,1),
      col0 VARCHAR(100) NULL,
      col1 VARCHAR(100) NULL,
      col2 VARCHAR(100) NULL,
      ...
      -- put all other varchar columns here
      ...
      col49 VARCHAR(100) NULL
    )
GO

Now, let's insert some rows:

INSERT INTO dbo.test_varchar(col0)
SELECT 'a'
GO 100000

Notice that only the first varchar column is filled with not null value. Let's see how big the table is now:

EXEC sp_spaceused N'dbo.test_varchar'

The results:

 

name         rows   reserved data     index_size unused
------------ ------ -------- -------- ---------- ------
test_varchar 100000 2440 KB  2376 KB  16 KB      48 KB

 

Now, let's delete all rows and fill the table again:

TRUNCATE TABLE dbo.test_varchar
GO

INSERT INTO dbo.test_varchar(col49)
SELECT 'a'
GO 100000

This time only the last varchar collumn is not null. Again, let's check the size of the table. The results:

 

name         rows   reserved data     index_size unused
------------ ------ -------- -------- ---------- ------
test_varchar 100000 12232 KB 12128 KB 56 KB      48 KB

 

The size is almost five times larger than when the first column was not null! What's going on?

To show the thing I will use some very nice tool - SQL Internals Viewer (of course, I can use DBCC IND and DBCC PAGE commands instead). First I will show the structure of a single row with the first varchar column not null.

This is a common row structure in SQL Server. Nothing special - you can see some NULL bitmap there, some column offset array and so forth. What's the most important you can calculate the size of the row - 22 bytes.

Now I will do the same with the row having only the last varchar column not null.

This time the size of the row is 120 bytes. And if you take a look at the numbers above you should see that 100 bytes is spent on the offset array. So it seems SQL Server stores the offsets for all columns up to the last not null one. That means if you have a table with some number of variable-length columns that will remain null in most cases it can be better to put them at the very end of the column list in a table. This can seem to be not a real life example (some people could say that we should avoid nullable columns) but I think we should consider this behavior when designing our tables (less space means less pages, less pages means less reads, less reads means faster queries).

Now, let's see if there is some answer for this problem in SQL Server 2008 CTP6. I will use sparse columns instead of just nullable ones.

CREATE TABLE dbo.test_varchar
    (
      id INT PRIMARY KEY IDENTITY(1,1),
      col0 VARCHAR(100) SPARSE NULL,
      col1 VARCHAR(100) SPARSE NULL,
      col2 VARCHAR(100) SPARSE NULL,
      ...
      -- put all other varchar columns here
      ...
      col49 VARCHAR(100) SPARSE NULL
    )
GO

Again, I will fill the table like before (only the last column will have not null value).

INSERT INTO dbo.test_varchar(col49)
SELECT 'a'
GO 100000

Let's check the size of the table. The results:

 

name         rows   reserved data     index_size unused
------------ ------ -------- -------- ---------- ------
test_varchar 100000 2632 KB  2576 KB  16 KB      40 KB

 

Wow, I would say. I'm not going to claim that sparse columns are already my favorite feature in Katmai but they seem to deserve a closer look at (what about the performance of inserts and so forth) :-)

Oh, and by the way - the size of the table in SQL Server 2008 is the same if you fill only the first column with not null value.

Credits go to my coleague from Asseco Business Solutions S.A., Jarek Kuśmierek, who has made most of the research shown above.

Opublikowane 6 czerwca 2008 10:15 przez brejk

Komentarze:

# re: [EN] Column order - does it matter?

6 czerwca 2008 12:25 by Marcin Guzowski

Ciekawe. Jak to kiedyś śpiewał Depeche Mode "everything counts in large amounts" :)

# Sparse Column Storage

8 września 2008 19:45 by Collation, DateTime and XML

In SQL Server 2008, we implemented a set of features, including Sparse Column, Column Set + Wide Table,

Komentarze anonimowe wyłączone

About brejk

MVP, MCT, SQL Server geek