Managing Unstructured Data
in SQL Server
Prior to SQL Server
2008, storing and managing unstructured data is a bit difficult one, for that we have
only two options varbinary and image fields to store unstructured information.
Though it provides transactional consistency it badly hurts the performance. The
other approach many people follow is storing the path of unstructured data in
the table, it gives good performance but it has its own side effects like it
doesn’t ensure transactional consistency and it is difficult while doing backup
and restore we need to make sure the physical paths should available in the
target server.
FILESTREAM -- A new
feature introduced in SQL 2008 which allows you to store BLOBS (word documents,
images, pdf files) in the NTFS file system instead of database. As it is NTFS
SQL Server takes an advantage of streaming capabilities and maintains transactional
consistency of the data. While accessing FILESTREAM it uses streaming API and
it will not use SQL Sever buffer pool, hence it will not SQL Server memory for
processing. FILESTREAM is not a datatype and it is a attribute assigned to
varbinary(max) column, then it becomes filestream enabled column.
If you store any
image or pdf on the filestream enabled column then only pointer to the file
will be stored in the column and the actual file will get stored in the NTFS.
By default
installation this feature will be in disable state, In order to use this
feature it has to be enabled.
To enable FILESTREAM
and It can be enabled in three levels :
- Configuration Manager - SQL
Server - Properties - On FileStream Tab - Check the below items
- Enable FileStream
for Transact-SQL Access
- Enable FileStream for file
I/O Access
-
Windows
Share Name
|
Specify
Share Name SQL2012_FileTables
|
- Allow remote clients access
to filestream data
Enable FileStream
for File I/O Access --
If you are interested in enabling read and write FILESTREAM data from windows
then we need to check this check box.
Allow remote client
access to filestream data -- If remote
clients need to access the file share using their stored procedure or
application this feature must be enabled.
Using T-SQL
Level: Level of
FILESTREAM functionality to enable, a integer between 0 and 3. If not
specified, level 3 is default.
0 - FILESTREAM
functionality disabled.
1 - Enable
FILESTREAM for T-SQL access.
2 - Enable
FILESTREAM for T-SQL and file I/O streaming access but do not allow remote
clients file I/O streaming access.
3 - Enable
FILESTREAM for T-SQL and file I/O streaming access and allow remote client
streaming access.
Share: Name of
windows share that is created for file I/O streaming access to FILESTREAM data.
If not specified, instance name is default.
The major difference
between FILESTREAM enabled and a normal database is
For a normal
database we will be having only MDF and LDF files
For a FileStream
enabled database will be having MDF, LDF and FileStream Container
FILESTREAM Container
is a special folder in the NTFS where SQL Server will store all the FILESTREAM
data as files in the disk.
CREATE DATABASE
FileStreamTest ON
PRIMARY ( NAME =
FSTreamData, FILENAME = 'C:\Temp\FStreamData.mdf'),
FILEGROUP
FStreamFolder CONTAINS FILESTREAM( NAME = FStreamFolder, FILENAME =
'C:\Temp\FStreamFolder')
LOG ON ( NAME =
FStreamLOG, FILENAME = 'C:\Temp\NP\FStreamLOG.ldf')
GO
In the above
statement 3rd line on CREATE DATABASE Statement shows FILSTREAM Container,
which creates a folder in C:\Temp with name FStreamFolder, Under which there
are two folders $FLOG and filestream.hdr, The $FLOG is similar to transaction
log file in the database and filestream.hdr consists metadata information used
by SQL Server, (Please don’t make any changes to that file).
Now Creating a table with FileStream Enabled
column
Use FileStreamTest
go
Create table CARS
(CARID UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
CARIMAGE
varbinary(max) FILESTREAM Null)
For every FILESTREAM
enabled table you should create a column with UNIQUEIDENTIFIER ROWGUIDCOL.
After executing the
above command a new table will get created on the FileStreamTest Database and
if you observe the c:\temp\Fstreamfolder a new folder will be created with GUID
as its name, SQL Server will creates one folder per each table when there is
FILESTREAM enabled column exists.
To Store and
retrieve information from a FILESTREAM Enabled column
-- Variable to Store
ImageData
DECLARE @img AS
VARBINARY(MAX)
SELECT @img =
CAST(bulkcolumn AS VARBINARY(MAX)) FROM OPENROWSET(BULK 'C:\temp\tulip.jpg',
SINGLE_BLOB ) AS x
INSERT INTO
CARS(CARID, CARIMAGE)
SELECT NEWID(), @img
select * from CARS
Select carid,
CARIMAGE.PathName() from CARS
From SQL Server 2012
In SQL Server 2012
FileTable concept was introduced and it will get more simplified than
FILESTREAMs, File table is a feature newly introduced in SQL Server 2012 is an
enhancement of FILESTREAM which was introduced in SQL Server 2008.
FileTable is a
special where you need not to specify column definition while creating a
filetable, the schema for the filetable is fixed.
Below is the syntax
for creating FILETABLE in SQL Server 2012
CREATE DATABASE
FileTeableDemo
ON
PRIMARY ( NAME =
filetable_data,FILENAME = 'C:\temp\filetable.mdf'),
FILEGROUP
FileTableFG CONTAINS FILESTREAM(NAME = FileTableDemo, FILENAME =
'C:\temp\FileTableDemo')
LOG ON ( NAME =
filetable_log,FILENAME = 'C:\temp\filetable.ldf')
WITH FILESTREAM(
NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = 'FileTableFolder')
GO
SELECT * FROM
sys.database_filestream_options WHERE database_id=DB_ID('FileTeableDemo')
USE FileTeableDemo
GO
CREATE TABLE
MyDocuments AS FileTable
CREATE TABLE
MyImages AS FileTable
To Insert any
records into FILETable that can be done in two ways: 1. Using T-SQL Like we did
for Filestream and 2. Right Click on FileTable on which you want to insert
records - Explore file table - Copy and paste images into it.
select * from
mydocuments
select * from
myimages