Tuesday, September 10, 2013

SQL Server Statistics Part I


In my previous post we learned something about Index seek and Index scan, If the data is highly redundant then Index will not be used, Index will be used only if the data is highly selective enough (that means when query fetches low number of records). If the data is not highly redundant or highly selective then optimizer uses statistics to take decision which one to use like Index scan / Index seek / what Index to be used / full table scan etc.,

Statistics

It is a critical metadata that assists query optimizer to take better decisions. It helps optimizer to do cost based estimation by using system tables and system catalogs by reading how many pages that table has and how many rows that table has. Also it helps in deciding the usefulness of indexes by verifying where clause, order by, group by etc.., based on the information data access algorithms will change to accessing the data.

The purpose of optimizer is to find good plan with fast, not to find good plan.
By seeing the statistics query optimizer will do cost estimation at each step and come up with an efficient plan with lowest cost. So statistics plays important role in preparation of efficient cost based estimate plan so statistics should be accurate and we need to make sure these are up to date. 

This information will be stored in sys.indexes and sys.stats tables in SQL Server. We can 249 non clustered indexes in SQL 2005 and 999 non clustered indexes on SQL 2008, and we can have 2000 stats in SQL 2005 and 10000 stats in SQL 2008 per table and 30000 stats per table in SQL Server 2008 R2.

1. For every index statistics are automatically created
2. We can create statistics on individual columns automatically or manually
3. We can create statistics on multiple columns manually


To update statistics manually
Sp_updatestats (OR) Update Statistics

No comments: