SQL Server 2016 CTP2 finally available for a public and it brings many new features, which are worth of checking.
However, one of those new initiatives, I believe, can bring some discussions in DWH/BI environments. The name of it is: Temporal table. It can be especially actual for a BENELUX area, where the DataVault methodology is highly adopted.
Seems, Microsoft, according to the best company traditions, brings another naming confusion. Currently SQL Server ships temporary tables and table variables. But, with a new release, temporal table is a new guy in a toolbox, however usage of the new tool is completely different:
A temporal table is a table for which a PERIOD definition exists and which contains system columns with a datatype of datetime2 into which the period of validity is recorded by the system, and which has an associated history table into which the system records all prior versions of each record with their period of validity. With a temporal table, the value of each record at any point in time can be determined, rather than just the current value of each record. A temporal table is also referred to as a system-versioned table. Read more in online documentation.
First of all, new database has to be created with a compatibility level SQL 2016 (or 130):
-- step 1: Create database, tables and seed dummy data CREATE DATABASE SQL2016Demo; GO USE SQL2016Demo; CREATE TABLE dbo.Product ( ProductID INT IDENTITY PRIMARY KEY, ProductName VARCHAR(100) NOT NULL ) CREATE TABLE dbo.Project ( ProjectID INT IDENTITY PRIMARY KEY, ProjectName VARCHAR(100) NOT NULL, ProductID INT FOREIGN KEY REFERENCES dbo.Product(ProductID) ) GO INSERT INTO dbo.Product (ProductName) VALUES ('Product A'); INSERT INTO dbo.Project (ProjectName, ProductID) VALUES ('Project A' , ( SELECT ProductID FROM Product WHERE ProductName = 'Product A' ) ); SELECT * FROM dbo.Product; SELECT * FROM dbo.Project; GO
As a result, very simple data model prepared with Just one row per entity:
Meet new friend – “system versioning”
The most important part starts by executing following block of the code:
-- step 2: Enable system versionning BEGIN TRAN -- Product ALTER TABLE dbo.Product ADD PERIOD FOR SYSTEM_TIME (Valid_From, Valid_Till), Valid_From datetime2 GENERATED ALWAYS AS ROW START NOT NULL DEFAULT GETUTCDATE(), Valid_Till datetime2 GENERATED ALWAYS AS ROW END NOT NULL DEFAULT CONVERT(DATETIME2, '9999.12.31'); ALTER TABLE dbo.Product SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.Product_History, DATA_CONSISTENCY_CHECK = ON)) -- Project ALTER TABLE dbo.Project ADD PERIOD FOR SYSTEM_TIME (Valid_From, Valid_Till), Valid_From datetime2 GENERATED ALWAYS AS ROW START NOT NULL DEFAULT GETUTCDATE(), Valid_Till datetime2 GENERATED ALWAYS AS ROW END NOT NULL DEFAULT CONVERT(DATETIME2, '9999.12.31'); ALTER TABLE dbo.Project SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.Project_History, DATA_CONSISTENCY_CHECK = ON)) COMMIT TRAN GO -- check table content again: SELECT * FROM dbo.Product; SELECT * FROM dbo.Project;
As you can see, two system-maintained columns were added to both tables:
Yes, that is right, since columns are system-maintained they hold non-static values and SQL Server will keep them actual on every DML operations. It can been shown in the following example:
-- step 3: update data in the tables BEGIN TRAN UPDATE dbo.Product SET ProductName = 'Product B' WHERE ProductID = 1 INSERT INTO dbo.Project (ProjectName, ProductID) VALUES ('Project B', 1 ); COMMIT TRAN SELECT * FROM dbo.Product; SELECT * FROM dbo.Project;
As the result, name of the product changed to “Product B” and new project created. While data in a second dataset looks valid and both operations are reflected, data of the first dataset represents current state only, in other words, previous version of the row is not appearing. The reason is simple – by default, only current and active members are queryable. If previous version of the record has to be retrieved and shown, special construction can be used as a part of FROM clause:
-- Step 4: Query historic data SELECT * FROM dbo.Product FOR SYSTEM_TIME FROM '2000-01-01' to '9999-12-31' Order by Valid_From;
This time, the result looks like expected and both versions of the row retrieved.
Make it working
Let’s put all information together and solve simple task: perform join of entities and retrieve historic information, so data should be valid for some particular point in time.
Generic query looks little bit overwhelmed, since sub-queries to versioned data written as derived tables. The first dataset is retrieving state of data at moment of initial seeding:
DECLARE @dtPointInHistory DATETIME2 = '2015-05-31 18:13:10.09' SELECT prj.ProjectID, prj.ProjectName, prd.ProductName -- wrapper for a project FROM ( SELECT ProjectID, ProjectName, ProductID FROM dbo.Project FOR SYSTEM_TIME AS OF @dtPointInHistory ) prj -- wrapper for a product JOIN ( SELECT ProductID, ProductName FROM dbo.Product FOR SYSTEM_TIME AS OF @dtPointInHistory ) prd ON prj.ProductID = prd.ProductID
The result looks correct, since only one row per entity had added and product still had name: “Product A”.
In order to retrieve actual state of data, current timestamp assigned to the variable:
DECLARE @dtPointInHistory DATETIME2 = GETUTCDATE()
The result looks as expected again – product name changed and two projects assigned to that single product.
What can Temporal Tables mean for SQL Server DWH/BI community?
In short, this feature can serve as a foundation brick for DWH system. Just couple of simple steps required to make entire solution working:
- build a plain, relational, but not over-normalized datamodel;
- enable system versioning on every entity of the datamodel;
- load data directly to entities by using MERGE statements, wrapped in a single transaction;
That is it! No need to model and create structures like links, satellites, then satellites for links, then PIT tables for links and their satellites just to store historical information of couple of entities and then build even more complex queries to retreive data. No need also to create specialized DWH feeding ETLs by itself, which are supposed to load all that variety of object in a consistent way.