Friday, November 16, 2012

SQL Server Data Pages, Extents etc


The fundamental unit of data storage in SQL Server is pages. All the data that stored inside the SQL Server is on data pages. The size of data page is 8Kb. Data is stored in a data file with extension .mdf.  

A page is an 8kb (8192 bytes) and 132 bytes are used for management and 8060 is the actual data page

1. Page Header (96 bytes)
2. Page Data
3. RowOffset –(36 bytes) It contains address of all records that are stored in data page.

The disk space is logically divided into pages and the pages are numbered contiguously from 0 to n. Disk Input and output operations are performed at page level.

Pages are grouped together into extents. An extent is a collection of 8 physical contiguous pages. An extent size is 64 kb. Data rows cannot span across pages.

There are different types of pages
Data Page -- Data rows with all data, except text, ntext, image, varchar(max), varbinary(max), and xml data.
Index Page -- Index Entries
Text / Image -- Large object data types : text, ntext, image, varchar(max), xml etc.,
Variable length columns when data row exceeds 8k ie., varchar, nvarchar, varbinary and sql_variant
Global Allocation Map, Shared Allocation Map -- Information whether extents are allocated.
Page Free Space -- Information about page allocation and free space available on pages.
Index Allocation Map – Information about extents used by a table or index per allocation unit.
Bulk Changed Map – Information about extents modified by bulk operations since the last Backup log statement per allocation unit.
Differential Changed Map – Information about extents that have changed since the last backup database statement per allocation unit.

Rows cannot span across pages, the maximum amount of data stored in a row is 8060 bytes, when a total row size exceeds 8060 bytes, SQL Server will dynamically move large size columns into ROW_OVERFLOW_DATA allocation unit of other page and maintains a 24 byte pointer on a original page in the IN_ROW_DATA allocation unit. This is done whenever Insert or update operations increase size of row beyond 8060 bytes. SQL Server will dynamically move columns back to data pages when size decreases.

Extents
An extent is 8 physically contiguous pages. There are two types of extents a) Uniform Extent and b) Mixed Extents

Uniform Extents are owned by a single object.
Mixed Extents are owned and used by different objects.

GAM & SGAM allocation maps are used for extent allocation.
GAM (Global Allocation Map) – It uses 1 bit to tell whether an extent is free or not. If the bit is 1, then extent is free, if the bit is 0 then extent is allocated. Each GAM covers 64000 extents, almost 4 GB of data.
SGAM (Shared Global Allocation Map) – It uses 1 bit to tell whether a mixed extent and also have at least one free page, If the bit is 1 then it is a mixed extent and has one free page, if the bit is 0 then extent is not used as mixed extent or it is a mixed extent and all pages are being used. Each SGAM covers 64000 extents, almost 4 GB of data.

PFS (Page Free Space) – PFS records the allocation status of each page, whether the page has been allocated or the amount of free space left. The PFS has one byte for each page, recording whether the page is allocated and if so, whether it is empty or 1 to 50 percent full, 51 to 80 percent full or 96 to 100 percent full.

After an extent has been allocated to an object, the database engine uses PFS to record which pages in the extent are allocated or free. This information is used when the database engine has to allocate new page. DB Engine uses PFS to find a page with free space for newly inserted row.
A PFS is the first page after the file header in a data file (page number 1), and GAM (page number 2), and then SGAM (page number 3).


SQL Server uses two internal data structures to track extents that are modified by bulk copy operations and extents modified since last full backup.

DCM – Differential Changed Map – This tracks the extents that have changed since the last backup database statement. If the bit for an extent is 1 then it is modified, if it is 0 then not modified. This will reduces that number of pages that differential backup must scan. The length of time that a differential backup runs depends on the extents that are modified since last backup but not size of database.
BCM – Bulk Changed Map – This tracks the extents that are modified since the last backup log statement. If the bit is 1 then the extent is modified by bulk logged operation, if it is 0 then it is not modified.

No comments: