Saturday, February 21, 2015

In Memory Tables vs Table Variables - Comparison

Introduction
Recently there was a casual discussion with one of my colleagues who was asking me regarding the newly introduced memory-optimized tables in SQL Server 2014. The discussion went through few aspects of memory-optimized tables and their comparison against table variables. This blog summarizes the points that came out of the discussion. Thought of sharing it for the benefit of larger audience

What are memory optimized tables?
Memory-optimized tables are special type of table which resides in main memory. They got introduced in SQL Server 2014 version and is a method of implementing in-memory OLTP. They are transactional in nature and can preserve durability if required. Another important aspect of memory-optimized tables are that rows support versioning. This means that at any point of time the table preserves multiple earlier versions of its rows.
Another important feature of these tables are that you can make them durable which is the default behavior or keep them as non durable in which case the data doesn't get persisted in disk. This will help you to save some disk IO, but at the expense of data loss in case of a server crash or restart.
The memory-optimized tables can be accessed using normal adhoc T-SQL statements or using natively compiled stored procedures.
What is a table variable?
Table variables are variables of table data type. Tables variable reside in main memory until they get swapped out to disk for space. They also does not store distribution statistics information within them. They also cause no recompiles. Tables variables also does not allow modification using ALTER TABLE. Explicit indexes cannot be created in table variables until SQL Server 2014
 Now that we have seen what a memory optimized table and table variables is we will now see some differences in behavior between them

Differences between memory optimized vs table variables

Scope
When you consider scope memory optimized tables have scope same as actual tables with a slight change. The memory optimized tables will be available across all connections. They come in two flavors - durable (default) and non durable. In the case of durable tables, they preserve the schema as well as the data as they're persisted to the disk whereas in the case of non durable tables only the schema is persisted. This means that during a server restart the tables gets recreated but without any data in them. The durability can also be made fully durable or delayed durability with the latter having a risk of causing some data loss in case of an intermediate server restart/failure.
In contrast the table variables have scope only within a batch in which they're declared. This is evident from the below example

declare @t table

(
id int,
val varchar(10)
)
insert @t
values (1,'Test'),
(2,'jhgjgu')

select * from @t
go

SELECT * FROM @t

The first select works whereas second one does not as its outside the batch in which table variable is created (GO represents the end of that batch)

Tempdb Usage
Memory-optimized tables will reside in main memory itself.So they wont use up any tempdb resources at all!
On the other hand table variables will be in main  memory only until the space requirements is within the  available space of cache. Once there is a requirement for extra space it utilizes the tempdb. Hence while using table variables for large datasets you would see higher usage of tempdb.

Transaction Support
Memory-optimized provides transaction support similar to normal tables. They have two ways of handling transactions based on whether we want it to be fully durable or delayed durable. The former will cause the changes made by the transaction to be persistent whereas the latter would be durable   only after the in-memory transaction log is persisted to disk.
Table variables on the other hand doesn't support transactions. This is very evident from the below example snippet

begin tran testtemptabletran
declare @t table

(
id int,
val varchar(10)
)
insert @t
values (1,'Test'),
(2,'jhgjgu')

rollback tran testtemptabletran
select * from @t

Table Statistics
Memory-optimized tables store statistics info just like normal tables. But there is a difference in the way statistics gets updated for memory-optimized tables against normal disk persisted tables. Whilst disk based tables update statistics automatically in the case of memory-optimized tables we have to do this manually. We need to make use of CREATE STATISTICS statement for this purpose. We need specify  NORECOMPUTE clause while creating/updating statistics for memory-optimized tables as they does not update the statistics themselves. We should also specify FULLSCAN mode as memory-optimized tables only support Fullscan statistics which is a time consuming process as against disk based tables which uses sampled statistics as the default option.
In comparison table variables store only limited distribution statistics. This is why quite often you find wrong cardinality estimates in the execution plan when query involves table variables.

So as seen from above points both have their own pros and cons.
The main advantage of memory optimized tables are that they are in main memory itself unless you want to persist data to disk. So question is whether these advantages can be coupled with table variables to get more advantage. Fortunately we have a way of doing this.

The Hybrid Approach - Using Memory Optimized Table Type Variables
For creating memory optimized table variables what we can do is to create a table type which is memory-optimized. Once this is done we can create a variable for this table type which will be memory-optimized in nature. This will give merge the advantages of both tables variables as well as memory-optimized objects.
The main among them are

  • The memory-optimized table type variables never get swapped to disk as they are now memory-optimized. As such there is no tempdb usage and associated IO costs
  • The memory-optimized table type variables also take advantage of memory-optimized algorithms and can be accessed using natively compiled procedures for more efficiency
  • One more advantage of memory-optimized table type variables over memory optimized tables is that since they're non transactional in nature they support cross db queries whereas normal memory-optimized tables will not allow cross db access. This comes handy especially in the case where you want data to copied across multiple db tables where memory-optimized table type variables can be used as a medium for data transfer
We can create memory-optimized table type variable using below code

CREATE TYPE MEMOPT_TT AS TABLE(
      
      Col1 [int] NOT NULL,

      Col2 varchar(100) NOT NULL,

      INDEX [IX_Col1] HASH (Col1)

            WITH ( BUCKET_COUNT = 8)
)

WITH ( MEMORY_OPTIMIZED = ON )

DECLARE @od MEMOPT_TT

Now you can use this variable in dml operations like below

INSERT  @od   VALUES

      (123,'Test'),

      (345,'Blah blah')
SELECT * FROM @od 


Summary
From the above points we can clearly see the advantages of conventional table variables vs the new memory-optimized tables usage. This also gives us some insights into how the two concepts can be merged to get a hybrid solution which merges the advantages of each.

References
https://msdn.microsoft.com/en-IN/library/dn535766.aspx?f=255&MSPPError=-2147217396
http://www.brentozar.com/archive/2014/06/temp-tables-table-variables-memory-optimized-table-variables/
http://www.sqlskills.com/blogs/bobb/sql-server-2014-memory-optimized-table-variables-metadata-and-garbage-collection/