Memory Optimized Tables (MOT) are fully ACID (Atomicity, Consistency, Isolation and Durability) compliant data storage structures. They are stored as an organized sets of rows, and follow a special BW-Tree structure in order to support non-clustered hash indexes (At least one index is needed for MOTs to link rows with each other). Data is stored mainly within the program’s memory, where it is accessible to users. The data can also be copied onto the system’s hard drive by changing durability of MOT table. Each table has multiple versions of a single row, which allows for a row to be concurrently read and written. Memory optimized tables are durable, transactional and accessible through Transact-SQL. These features improves a MOT’s performance by 5 to 20 times. Lets see all the above features in more detail.
How Concurrency is achieved?
By default, SQL Server adopts a pessimistic approach to concurrency, acquiring locks in order to avoid read phenomena such as dirty reads, non-repeatable reads and phantom reads. As a result, readers block writers and writers block readers. Under an alternative concurrency model, enabled via snapshot-based isolation, SQL Server can prevent some or all of these read phenomena, depending on the mode of snapshot-based isolation in use, without the need to acquire locks, therefore greatly reducing blocking in the database. In order to achieve this “optimistic” concurrency, SQL Server uses a row versioning technique, whereby it stores in
tempdb copies (versions) of all the previously committed versions of any data rows, since the beginning of the oldest open transaction (i.e. it keeps those copies as long as there are any transactions that might need to access them). The space in
tempdb used to store previous versions of changed rows is the version store. When using this row versioning, readers do not block writers, and writers do not block readers (though writers do still take locks and will block other writers). For more details about row versioning and snapshot isolation https://www.red-gate.com/simple-talk/sql/t-sql-programming/row-versioning-concurrency-in-sql-server/
Memory Optimized Tables goes one step further and eliminates locking completely thus writer will not block writer it will generate error (see snapshot isolation section below for more information). Let’s now see what happens during insert and delete.
INSERT: When new row is created it has a BEGIN TS indicating when its came into existence, and Infinite (∞) END TS indicating that this row is the active current version of the row.
When row is still getting inserted i.e not yet committed (in transaction) TxnId is inserted in END TS and once transaction completes Infinite (∞) END TS is placed replacing TxnId.
UPDATE: When row content is updated or deleted copy of the row is created and END TS is placed in the old version of the row replacing infinite entry.
Old Active transactions still using this row continue to use version because their transaction start time is between those two TS. Once that transaction is complete
New transactions that begin after 1320 will use this new version. As active transactions finalize rows will eventually be available for GC, when this occurs it can be dropped from memory.
How Durability is achieved?
Each memory-optimized table has a DURABILITY option. The default SCHEMA_AND_DATA option indicates that the data in the tables is fully durable and persists on disk for recovery purposes. Operations on such tables are logged in the transaction log, which allows SQL Server to support database transactional consistency and recreate the data in the event of a SQL Server crash or unexpected shutdown. SCHEMA_ONLY is another option and indicates that data in memory-optimized tables is not durable and would be lost in the event of a SQL Server restart or crash. Operations against non-durable memory optimized tables are not logged in the transaction log. Non-durable tables are extremely fast and can be used if you need to store temporary data in use cases similar to when you would use temporary tables in tempdb . As the opposite to temporary tables, SQL Server persists the schemas of non-durable memory optimized tables, and you do not need to recreate them in the event of SQL Server restart.
CREATE TABLE dbo.Customers ( [CustomerID] int identity(1,1) not null constraint PK_Customers primary key nonclustered hash with (bucket_count = 131072), [CustomerName] varchar(128) collate Latin1_General_100_BIN2 not null, [City] varchar(64) collate Latin1_General_100_BIN2 not null, [SSN] char(9) not null, [DateOfBirth] date not null, INDEX IDX_Customers_City nonclustered hash(City) with (bucket_count = 16384), INDEX IDX_Customers_Name nonclustered(CustomerName) ) WITH(memory_optimized = ON, durability = SCHEMA_AND_DATA)
- SQL logs in memory transactions only upon commit. Logged information is smaller than similar traditional transactions as memory optimized tables have no undo information.
- Write to Checkpoint File Pairs(CFP) is not page based and relies on a checkpoint process. It is managed by a background thread called the checkpoint worker thread, the checkpoint worker thread streams data to checkpoint file pairs (CFP)
- Traditional checkpoints still occur and additional IMOLTP related checkpoints also occur marking log records inactive (1.5 GB)
- Memory-optimized tables support at most eight indexes and unlike B-Tree indexes on traditional disk based tables, these indexes don’t duplicate data, but rather just point to the rows in the chain.
- Durable memory-optimized tables should have a unique primary key constraint defined. Non-durable memory-optimized tables do not require the primary key constraint; however, they should still have at least one index to link the rows together.
Which Isolation Levels are supported?
COMMIT DEPENDENCY and PRE-COMMIT Validation:
When row is not yet committed it has TxnId and we know that it will commit (optimistic). This will create commit dependency and these transactions will fail if they were depending on some other transaction which did not commit without causing blocking. So there will be error given to user.
- Snapshot Isolation (default): Only rolled back on primary key violation
- Repeatable Read Isolation: Any violation causes roll back
- Serializable Isolation: Any violation causes roll back