Tuesday, September 10, 2013

SQL Server Statistics Part II

Statistics are nothing but a histogram and associated density groups

Density –Uniqueness of values with a set of data, calculated as 1/distinct number of rows.

1/2000 = 0.0005 density

Histogram – Represents the distribution of values for a set of data.

Statistics can be created in different ways

1. Statistics are created when you create any index in SQL Server

To see this, create an index on any table

SSMS – Databases – KalyanDB – (Create some index on Sales Table)

SSMS – Databases – KalyanDB – Tables – Sales – Statistics – There will be a statistics created same as Index name, if you double click on it then it displays statistics and density information.

2. If Auto Create Statistics is set to TRUE then SQL Server optimizer will create statistics automatically on non indexed columns (column statistics starts with _WA_*), And this is a permanent database object in the database until you drop it manually. And we need to remember one thing here Auto Create Statistics doesn’t create multi column stats it creates only single column stats.

SSMS – Databases – KalyanDB – Tables – Sales – _WA_Sys_00000003_0CBAE877 – There will be a statistics created by WA.

3. User can define statistics on columns by using create statistics command

Sp_helpstats ‘Table_name’ -- Displays statistics of columns and indexes on the given table. If you don’t specify ALL parameter it tries to display only column level statistics not at Index level and if there is no Index level statistics it displays “This object does not have any statistics.” 

Sp_helpstats ‘Table_Name’, ‘ALL’ (or)

DBCC Show_statistics(‘TableName’,’StatisticsName’) 

The DBCC Show_Statistics command displays current query optimization statistics for a table or indexed views.

The command returns three different datasets which represents statistics header information, density vector and histogram subsets.

Statistics Header Information contains the below information

Updated -- When the statistics last updated

Rows -- Total number of rows in the table or indexed view when the statistics were last updated.

Rows Sampled – If total number of rows < rows sampled then displayed histogram and density results are estimated based on sampled rows.

Steps -- Number of steps in the histogram, each step spans a range of column values followed by an upper bound column value. The maximum steps are 200.

Density Vector information displays All Density, Average Length and Columns information

Histogram displays distribution of values in the first key column of the statistics object.

When Statistics got updated?

If Auto Update Statistics is set to true then SQL Server optimizer will trigger update statistics only if the 20% of table + 500 rows got updated in the table.

There are two options

Auto Update Statistics – SQL Server updates statistics before generating the execution plan if the statistics are out of date.

Auto Update Statistics Asynchronously – SQL Server uses the out dated statistics to generate the execution plan and then update statistics afterwards.

To view statistics information

Sp_autostats -- Displays or changes the statistics options.

Sp_autostats table_name -- displays tablename, statistics name, and last updated column

Sp_autostats ‘kalyandb.sales’, ‘OFF’ -- Set statistics to Off on sales table

Sp_autostats ‘kalyandb.sales’, ‘OFF’, Index_Name (without quotes)


How do we know that table is using statistics or not?
By verifying the estimated number of rows and actual number of rows from the actual / estimated execution plan. If the numbers are close then statistics are being used by optimizer.

Some times we may need to manually update statistics, for which we can use sp_updatestats or update statistics 

Sp_updatestats -- Update all statistics for the database.

Update statistics table_name -- Updates statistics for all indexes on the given table

No comments: