Wednesday, August 28, 2013

Table Organization in SQL Server (What is Heap)


Table Organization in SQL Server (What is Heaps)

SQL Server uses either the two methods to organize data pages in the partition.

 1. Heap 

2. B-Tree

What is heap?

A heap is a table without a clustered Index, but it can have non-clustered index. Heap has only data not index pages. It has one row in sys.partitions table with IndexId=0. Heaps also have allocation units like IN_ROW_DATA, ROW_OVERFLOW_DATA, LOB_DATA based on the datatypes it is used. The data rows are not stored in a serial order and there is no particular order for data pages also. As there is no index on heaps only Table Scan will perform while searching of any records. If there is no index is defined on a table the search will do a sequential order.

For suppose if you want to search a number 256 in a ID column, when there is no index is the search engine searches for 255 rows then it displays 256th record, to increase the performance some how we need to minimize it that’s what Balanced-Tree does.


/* Index ID = 0 is a Heap Table , Index Id = 1 for Clustered Index and a high value for non-clustered index on Index Id column */

select ss.name as schema_name, so.name as table_name, si.index_id
from sys.indexes si, sys.objects so, sys.schemas ss
where si.object_id = so.object_id and ss.schema_id = so.schema_id
and si.index_id = 0 and so.type='U' 

sp_helpindex Table_Name (from the above result)

As we discussed there is no link between data pages that we can see using the DBCC IND undocumented command

DBCC IND (dbname, tablename, 0) – Observe the below result and ignore the first row as this row is a IAM page row (see pagetype should be 10), and from second row onwards data pages, if you observe clearly there will no data on the PrevPagePID and NextPagePID columns. 

-- To check the fragmentation --

select alloc_unit_type_desc,  index_depth, page_count, avg_page_space_used_in_percent,
record_count, forwarded_record_count
from sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('testindexes'),0, Null, 'detailed') 

Modifications on Heaps

Insert - Added record where there is space

Delete - Removed and marked record as ghost but it won’t de-allocate the pages so when there is any read operation on the table it reads all the pages (For example we have 1500 rows in a heap table which occupied 80 pages and we have deleted 1490 rows then we have only 10 rows left, as this is a heap table it wont de-allocate 80 pages, so while reading the engine has to go thru all the 80 pages which is very expensive operation). To avoid this we need to rebuild the heap.

Update - While updating record if the new record size is higher than the old record then it created forwarded records, it is much expensive while doing any IO operation.

Note –Rebuilding a heap is a very expensive operation when there is a big data on the heap table, because all non-clustered indexes being re-build while rebuilding heap

When there is heap table preferred?

Inserts in to heap table are faster than clustered index table.

No comments: