Memory Optimized Table – Data Model

Alongside the standard relational engine and its traditional disk-based tables, SQL Server 2014 brought with it a much talked about feature called In-Memory OLTP, a high performance, memory-optimized engine that allows users to create data in tables that reside permanently in-memory.
This is not just another “DBCC PINTABLE”, memory-optimized tables and indexes are completely new data structures and are very different from their disk-based counterparts and they have the potential to improve online transaction processing performance significantly. Let us understand how this is different from traditional disk-based tables by understanding the design goals for this new memory optimized engine.

Design goals for Memory Optimized Engine:

1) Optimize data storage for main memory: Data in In-Memory OLTP is not stored on disk in data pages nor does it mimic an on-disk storage structure when loaded into memory.
2) Eliminate latches and locks: All In-Memory OLTP internal data structures are latch- and lock-free. In-Memory OLTP uses a new multi-version concurrency control (MVCC) to provide transaction consistency. From a user standpoint, it behaves in a way similar to the regular SNAPSHOT transaction isolation level; however, it does not use locking under the hood. This schema allows multiple concurrent sessions to work with the same data without locking and blocking each other and thus improves the scalability of the system allowing to fully utilize modern multi-CPU/multi-core hardware.
3) Using native compilation: T-SQL is an interpreted language that provides great flexibility at the cost of CPU overhead. Even a simple statement requires hundreds of thousands of CPU instructions to execute. The In-Memory OLTP Engine addresses this by compiling row access logic and stored procedures into native machine code that can be executed immediately by the CPU, without the need for any further compilation or interpretation.

Memory Optimized Table:

Even though the creation of memory-optimized tables is very similar to the creation of on-disk tables and can be done with a regular CREATE TABLE statement, SQL Server works very differently with memory optimized tables. Every time a memory-optimized table is created, SQL Server generates and compiles a DLL that is responsible for the manipulation of table row data. The In-Memory OLTP Engine is generic, and it does not access or modify row data directly. Rather, it calls DLL methods instead. As you can guess, this approach adds limitations on the alterability of the table. Alteration of the table would require SQL Server to recreate a DLL and change the format of data rows. It is not supported in SQL Server 2014, and the schema of a memory-optimized table is static and cannot be altered in any way after it is created. The same is true for indexes. SQL Server requires you to define indexes inline in a CREATE TABLE statement. You cannot add or drop an index or change an index’s definition after a table is created. SQL Server 2016 allows you to alter table schemas and indexes. This, however, creates a new table object in-memory, copying data from the old table. This is offline operation, which can be time- and resource-consuming and requires you to have enough memory to accommodate multiple copies of the data. You can combine multiple ADD or DROP operations into a single ALTER statement to reduce the number of table rebuilds.
Indexes on memory- optimized tables are not persisted on-disk. SQL Server recreates them at the time when it starts the database and loads memory-optimized data into memory. As with on-disk tables, unnecessary indexes in memory-optimized tables slow down data modifications and use extra memory in the system.

Data model:

Pages and Extents are the data structures used for traditional disk based tables. The rows for this kind of table are identified by page number and offset inside the page. These kind of data structures are optimized for Pessimistic Concurrency which uses Locking and Latching for allowing concurrent reads and writes. This is not true for Memory-Optimized Tables.

In order to provide better efficiency and concurrency, the rows for MOTs are organized differently, as shown in the following figure. To allow concurrent reads and writes, MOTs uses Optimistic Concurrency which is implemented using Row Versioning, resulting in the following structure for the Row Header. Data structures used in MOTs are Pointer and Row chains.

The Begin and End Timestamp is used to control the “Row Visibility”. The Begin Timestamp contains the time of the creation of the row. Once the End timestamp is set, the row is interpreted as “removed”. It is important to understand that a removed row will still be stored in memory for a while. Once the Garbage Collector runs and there are no more active transactions with a time prior to the one set as End Timestamp, the row will be cleared from memory. Notice that a removed row can be generated not only from DELETE statements, but also from UPDATE statements. Under the hood, an UPDATE is doing an INSERT and a DELETE, because it is more efficient to create a new row with the modified data than to update the current one. Aborted INSERT transactions are also a possible cause for deleted rows.

Next 8 bytes contains The Statement ID is the unique ID value for the statement that created the row. The “IndexLink Count” is a number indicating how many indexes are referencing the row.

The remaining slots are used to store index pointers (8 bytes each), which are linking the rows of a particular table. That is the reason why every memory-optimized table must have at least one index.