Wednesday, January 22, 2014

AppLocks in Hekaton

All of you who followed the development of the SQL Server 2014 InMemory OLTP Engine (aka “HEKATON”) will know that Hekaton per definition does not support locks. While this is a good thing per se, after all it’s all about speed, there are scenarios where you would need locks to ensure data integrity and avoid massive amounts of retries. Now you can of course go down the easy road and use sp_getapplock to take your own, application intended, lock to circumvent the problem, but that approach comes with major drawbacks:

1) To do that you need to have a transaction around your code block, which can unnecessarily slow down your operations and in case of “interlocking locks” be a real annoyance.

2) More importantly if you go down that road you are bound to hit the transaction log, even if all your tables in the transaction are in memory only. And THAT can really bring your performance down.

So… What to do about it? Well… If you are me… Write your own sp_getapplock using Hekton tables. The Pro’s of that are that you neither need a transaction nor hit the TLog of your database, the Con is that you are outside the control of SQLs Lock Manager, meaning that you have to ensure you cleanly release those “locks” on all possible codepaths.

What does that look like? Well… Take a look…

1) You need a table to store your locks in. I used a 200 characters nvarchar as a key, but you are really open to do whatever suits you:

CREATE TABLE dbo.HK_AppLock
(
    LockKey nvarchar(200) COLLATE LATIN1_GENERAL_100_BIN2 NOT NULL,
    LockDate datetime NOT NULL

   CONSTRAINT PK_HK_AppLock PRIMARY KEY NONCLUSTERED HASH (LockKey) WITH (BUCKET_COUNT=300000)
   ,INDEX IX_HK_AppLock_Expired (LockDate)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY)

Note that I have two indexes on that table. The Primary Key is built as a Hash Index to ensure maximum speed on point lookup (which is the normal scenario), the Expiry-Index is used for orphaned lock cleanup only (Range Scan). Also note that I use SCHEMA_ONLY duarability as persisting locks is not really something you need in daily operations…

2) Build Sprocs for GetAppLock and ReleaseAppLock

CREATE PROCEDURE sp_HK_GetAppLock
    @Key nvarchar(200)
AS
    SET NOCOUNT ON

    DECLARE @Error int
    DECLARE @LockTaken bit=0
    WHILE @LockTaken=0
    BEGIN
        BEGIN TRY
            INSERT INTO HK_AppLock(LockKey, LockDate)
            VALUES (@Key, GETUTCDATE())

            SET @LockTaken=1
        END TRY
        BEGIN CATCH
            SET @Error=ERROR_NUMBER()

            IF (@Error = 41302 OR @Error = 41301 OR @Error = 41305 OR @Error = 41325)
                WAITFOR DELAY '00:00:00:001'
            ELSE IF (@Error = 2627 OR @Error = 2601)
                WAITFOR DELAY '00:00:00:050'
            ELSE
                THROW
        END CATCH
    END

The idea is pretty simple here… Insert the requested lock into the table. If someone else holds the lock already (=the record is already in the table) we will see a PK violation, which is the indicator here. The Catch-Block handles three scenarios: First block handles W/W conflicts with a very short delay + retry, second block handles the PK violation, taking a longer delay in sort of a “spin lock” approach, third block throws whatever other exception we run into up to the client. So effectively what you get is a block until the Insert succeeds.

Note that I do not use native compilation here. Reason for that is that in Native sprocs you have no way of “retrying”, as the transaction context will never change in there.

CREATE PROCEDURE sp_HK_ReleaseAppLock
    @Key nvarchar(200)
AS
    SET NOCOUNT ON

    DECLARE @Error int
    DECLARE @LockReleased bit=0
    WHILE @LockReleased=0
    BEGIN
        BEGIN TRY
            DELETE FROM HK_AppLock
            WHERE LockKey=@Key

            SET @LockReleased=1
        END TRY
        BEGIN CATCH
            SET @Error=ERROR_NUMBER()

            IF (@Error = 41302 OR @Error = 41301 OR @Error = 41305 OR @Error = 41325)
                WAITFOR DELAY '00:00:00:001'
            ELSE
                THROW
        END CATCH
    END

Release goes the same way as Get, No magic here.

3) Last thing is to simulate SQLs lock manager. Meaning: Add a way to clean up orphaned locks.

CREATE PROCEDURE sp_HK_CleanupAppLock
AS
    SET NOCOUNT ON

    WHILE 1=1
    BEGIN
        BEGIN TRY
            DELETE FROM HK_AppLock
            WHERE LockDate<DATEADD(s, -30, GETUTCDATE())
        END TRY
        BEGIN CATCH
        END CATCH
        WAITFOR DELAY '00:00:05'
    END

What I do in my environment is have an Agent job starting every minute, calling that sproc. The sproc actually never finishes, so the 1-minute-schedule is just a precaution in case of problems and server restarts. In my implementation we assume that every lock existing longer than 30 seconds is orphaned. You can of course build more sophisticated logic in there, like including SPIDs or whatever else in the table to allow for a “real” lock management.

All in all the code is very simple and highly efficient. (at least compared to sp_GetAppLock…) It doesn’t solve all problems, but for us it sure solved many of them.