Managing Unstructured Data in SQL Server
Prior to SQL Server 2008, storing and managing unstructured data is a bit difficult one, for that we have only two options varbinary and image fields to store unstructured information. Though it provides transactional consistency it badly hurts the performance. The other approach many people follow is storing the path of unstructured data in the table, it gives good performance but it has its own side effects like it doesn’t ensure transactional consistency and it is difficult while doing backup and restore we need to make sure the physical paths should available in the target server.
FILESTREAM -- A new feature introduced in SQL 2008 which allows you to store BLOBS (word documents, images, pdf files) in the NTFS file system instead of database. As it is NTFS SQL Server takes an advantage of streaming capabilities and maintains transactional consistency of the data. While accessing FILESTREAM it uses streaming API and it will not use SQL Sever buffer pool, hence it will not SQL Server memory for processing. FILESTREAM is not a datatype and it is a attribute assigned to varbinary(max) column, then it becomes filestream enabled column.
If you store any image or pdf on the filestream enabled column then only pointer to the file will be stored in the column and the actual file will get stored in the NTFS.
By default installation this feature will be in disable state, In order to use this feature it has to be enabled.
To enable FILESTREAM and It can be enabled in three levels :
- Configuration Manager - SQL Server - Properties - On FileStream Tab - Check the below items
- Enable FileStream for Transact-SQL Access
- Enable FileStream for file I/O Access
Windows Share NameSpecify Share Name SQL2012_FileTables
- Allow remote clients access to filestream data
Enable FileStream for File I/O Access -- If you are interested in enabling read and write FILESTREAM data from windows then we need to check this check box.
Allow remote client access to filestream data -- If remote clients need to access the file share using their stored procedure or application this feature must be enabled.
Level: Level of FILESTREAM functionality to enable, a integer between 0 and 3. If not specified, level 3 is default.
0 - FILESTREAM functionality disabled.
1 - Enable FILESTREAM for T-SQL access.
2 - Enable FILESTREAM for T-SQL and file I/O streaming access but do not allow remote clients file I/O streaming access.
3 - Enable FILESTREAM for T-SQL and file I/O streaming access and allow remote client streaming access.
Share: Name of windows share that is created for file I/O streaming access to FILESTREAM data. If not specified, instance name is default.
The major difference between FILESTREAM enabled and a normal database is
For a normal database we will be having only MDF and LDF files
For a FileStream enabled database will be having MDF, LDF and FileStream Container
FILESTREAM Container is a special folder in the NTFS where SQL Server will store all the FILESTREAM data as files in the disk.
CREATE DATABASE FileStreamTest ON
PRIMARY ( NAME = FSTreamData, FILENAME = 'C:\Temp\FStreamData.mdf'),
FILEGROUP FStreamFolder CONTAINS FILESTREAM( NAME = FStreamFolder, FILENAME = 'C:\Temp\FStreamFolder')
LOG ON ( NAME = FStreamLOG, FILENAME = 'C:\Temp\NP\FStreamLOG.ldf')
In the above statement 3rd line on CREATE DATABASE Statement shows FILSTREAM Container, which creates a folder in C:\Temp with name FStreamFolder, Under which there are two folders $FLOG and filestream.hdr, The $FLOG is similar to transaction log file in the database and filestream.hdr consists metadata information used by SQL Server, (Please don’t make any changes to that file).
Now Creating a table with FileStream Enabled column
Create table CARS (CARID UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
CARIMAGE varbinary(max) FILESTREAM Null)
For every FILESTREAM enabled table you should create a column with UNIQUEIDENTIFIER ROWGUIDCOL.
After executing the above command a new table will get created on the FileStreamTest Database and if you observe the c:\temp\Fstreamfolder a new folder will be created with GUID as its name, SQL Server will creates one folder per each table when there is FILESTREAM enabled column exists.
To Store and retrieve information from a FILESTREAM Enabled column
-- Variable to Store ImageData
DECLARE @img AS VARBINARY(MAX)
SELECT @img = CAST(bulkcolumn AS VARBINARY(MAX)) FROM OPENROWSET(BULK 'C:\temp\tulip.jpg', SINGLE_BLOB ) AS x
INSERT INTO CARS(CARID, CARIMAGE)
SELECT NEWID(), @img
select * from CARS
Select carid, CARIMAGE.PathName() from CARS
From SQL Server 2012
In SQL Server 2012 FileTable concept was introduced and it will get more simplified than FILESTREAMs, File table is a feature newly introduced in SQL Server 2012 is an enhancement of FILESTREAM which was introduced in SQL Server 2008.
FileTable is a special where you need not to specify column definition while creating a filetable, the schema for the filetable is fixed.
Below is the syntax for creating FILETABLE in SQL Server 2012
CREATE DATABASE FileTeableDemo
PRIMARY ( NAME = filetable_data,FILENAME = 'C:\temp\filetable.mdf'),
FILEGROUP FileTableFG CONTAINS FILESTREAM(NAME = FileTableDemo, FILENAME = 'C:\temp\FileTableDemo')
LOG ON ( NAME = filetable_log,FILENAME = 'C:\temp\filetable.ldf')
WITH FILESTREAM( NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = 'FileTableFolder')
SELECT * FROM sys.database_filestream_options WHERE database_id=DB_ID('FileTeableDemo')
CREATE TABLE MyDocuments AS FileTable
CREATE TABLE MyImages AS FileTable
To Insert any records into FILETable that can be done in two ways: 1. Using T-SQL Like we did for Filestream and 2. Right Click on FileTable on which you want to insert records - Explore file table - Copy and paste images into it.
select * from mydocuments
select * from myimages