SQL Server best practices

Shankar Manamalkav

Microsoft has an awesome product in the shape of MS Sql server that does an extraordinary job of managing data. It is so feature packed that the help file itself runs into several gigs!

Remember however that there is no real substitute for good database design.
Having a killer hardware configuration housing a database that is not well engineered is a recipe for disaster.

Do go over EACH one of your tables with a fine toothed comb and make sure

  1. Every table gets both a primary key and a clustered index. The clustered index should be narrow. Please don’t ‘use’ the clustered index as a sorting mechanism for your data. Use the ‘Order by’ clause in your queries for the purpose. From my experience, ‘date fields’ render themselves very useful as clustered indexes (as most queries on tables involving date fields use date ranges as a filtering mechanism)
  2. Each table can have only one clustered index .So research your tables and applications properly before making a decision. If you are unsure, use the rule of thumb – create a surrogate (autonumber) field to serve as both the Primary key and clustered index.
  3. Use the index tuning wizard and analyze queries and sargable predicates for efficient index creation and use. Your queries should always dictate what indexes you require on your table. Remember Index are NOT required.. they are a performance boosting measure.
  4. Never overdo it. There is such a thing as ‘Over engineering’ a database. DON’T create indexes on binary values or columns that have low selectivity. The overhead in maintaining an index in such cases may defeat their very purpose.
  5. Do setup an automatic DBCC Indexdefrag routine and run it when server load is low.

Why clustered indexes SHOULD be as narrow as possible

The reason for this will become apparent if you recall how the Btree index structure stores data. The major difference between a clustered and non clustered index boils down to what the leaf nodes contain. If a table has a clustered index defined on it, then every Non-clustered index will contain the cluster key in it leaf nodes. So, if you have a wide key, you are essentially wasting storage space.

This also explains why it is computationally more economical to have a sequential surrogate key (autonumber) as the clustered key instead of a natural value (like last name). Picture the case for multiple inserts. Because names statistically tend to peak for certain alphabets (for example, there are more names that begin with ‘A’ than say ‘Q’), it may become necessary to split pages and then balance the trees AND simultaneously update every non clustered index pointer! Clearly, something that can hit performance.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s