Data quality and accuracy are foundational to a winning data strategy, and design choices for your organization’s data platform can bolster your data quality efforts. Implementing temporal tables can be a great tool for supporting data quality and accuracy, especially for tracking historical data changes and enabling point-in-time analysis. Unlike typical relational database tables, temporal tables automatically capture the time period each record is valid in the table, which is referred to as system versioning.
In this blog post, which is the first in a series of two posts, we’ll show you how to create, insert into, and query temporal tables. We’ll also investigate four specific use cases for temporal tables—historical trend analysis, point-in-time reporting, slowly changing dimensions, and compliance auditing.
Creating Temporal Tables
There are two requirements when creating a temporal table—a primary key and system time columns. A primary key is necessary to tie the records in the current table to the records in the history table. System time columns are row start and row end columns used to record the period of time during which a record was current or valid. The datetime values in the row start and row end columns are always system generated and will change as data is updated or deleted. Here’s an example create statement we used for a system-versioned customer table:
CREATE TABLE dbo.Customer ( [CustomerID] INT NOT NULL PRIMARY KEY ,[FirstName] VARCHAR(40) ,[LastName] VARCHAR(40) ,[Address] VARCHAR(120) ,[RowStart] DATETIME2 GENERATED ALWAYS AS ROW START ,[RowEnd] DATETIME2 GENERATED ALWAYS AS ROW END ,PERIOD FOR SYSTEM_TIME (RowStart, RowEnd) ) WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.Customer_History )) ;
In this example, we’ve defined a CustomerID field as the primary key and two system time columns that we’re calling RowStart and RowEnd. You’ll also notice that we don’t have to define a Customer History table separately. With temporal tables, the history table is defined as part of the current table.
Inserting and Updating Data in Temporal Tables
The syntax for inserts and updates on temporal tables is just like performing operations on regular tables. The key difference is the behavior of the table itself.
When inserting data into a temporal table, the row start column will be populated with the current system time and the row end column will be populated with the maximum datetime value of 23:59:59 on 12-31-9999. We ran inserts to populate our system-versioned customer table with several customer records:
INSERT INTO dbo.Customer ([CustomerID], [FirstName], [LastName], [Address]) VALUES (1, 'John', 'Smith', '15303 Dallas Pkwy #300'); INSERT INTO dbo.Customer ([CustomerID], [FirstName], [LastName], [Address] ) VALUES (2,'Jane', 'Doe', '999 18th Street Suite 1360-S'); INSERT INTO dbo.Customer ([CustomerID], [FirstName], [LastName], [Address] ) VALUES (3, 'Harry', 'Potter', '825 Town & Country Ln. Suite 550');
When updating data in a temporal table, the new version of the record will be stored in the current table, while the previous version will be pushed to the history table. The record pushed to history will keep the row start value of when it was created, but the row end value will be updated to show the time it stopped being the current record. We can see this behavior when we update a customer’s address with the query below.
UPDATE dbo.Customer SET [Address] = '225 N Michigan Ave' WHERE CustomerID = 2;
The row start time of the new current record is set to the current system time and the row end will still be the maximum datetime value. You’ll notice that the row end time of the original address record, which is now in history, matches the row start of the current record. As we continue to update this customer record, the previous versions are stored in the history table, and the row end is updated to reflect that the record is no longer current.
UPDATE dbo.Customer SET [Address] = '220 E 42nd St' WHERE CustomerID = 2;
Four Use Cases
Now that we’ve covered the basics of how to create, insert into, and update temporal tables, we’ll dive into four specific use cases for temporal tables—historical trend analysis, point-in-time reporting, slowly changing dimensions, and compliance auditing. For each use case, we’ll provide example queries and show why temporal tables are especially helpful for these cases.
1. Historical Trend Analysis
Because temporal tables track the value of a record over time, they’re especially helpful for looking at historical trends. Temporal tables allow you to query across the current and history table using the FOR SYSTEM_TIME clause.
To see a complete history of changes, we can use FOR SYSTEM_TIME ALL in our queries. This returns every version of the record and has a wide range of business applications. For example, we can easily provide customers with a report of their historical account balances with a query like the one shown below.
2. Point-in-Time Reporting
Temporal tables facilitate simple yet powerful querying of historical values, which we’ll demonstrate in the examples below.
The following query will return up to one record for this particular account and will display the account balance which was active at the time of the datetime field. This type of query is valuable for reporting on a particular point in time, such as when performing an audit or conducting data forensics.
SELECT * FROM dbo.AccountBalance FOR SYSTEM_TIME AS OF '2019-09-03 01:35:00' WHERE AccountNumber = 415
The following query will return all rows for our particular account that fall between the RowStart and RowEnd fields. One useful example would be using start-datetime and end-datetime as the upper and lower bounds of a time period, like a month. The resulting records would return all data from that time period for the account selected.
SELECT * FROM dbo.AccountBalance FOR SYSTEM_TIME FROM '2019-09-03 01:32:00' TO '2019-10-03 00:00:00' WHERE AccountNumber = 415
3. Slowly Changing Dimensions
Temporal tables are an excellent method when needing to implement a slowly changing dimension in your data model. In our example, our Customer table expects to see some changes over time: creation of new customers, the alteration of customer accounts, and the expiration of open accounts. Temporal tables provide a simple way to implement each of those features, as well as a straight-forward method to query your table for historical values.
The query below is similar to our point-in-time reporting query above. It uses the FOR SYSTEM_TIME expression to find the single customer record that was active at the time of the datetime field. This method can be useful in a data warehouse, when attempting to pair the correct dimension field with a transaction record, for example.
SELECT * FROM dbo.Customer FOR SYSTEM_TIME AS OF '2019-09-03 00:43:10' WHERE CustomerID = 2
4. Compliance Auditing
With newly introduced data privacy regulations, such as GDPR or CCPA, temporal tables can provide a potential solution for addressing data retention concerns. Temporal tables provide excellent functionality in tracking historical data, and by default this data will be tracked indefinitely. This provides a robust dataset when trying to build reports but could leave your organization at risk when storing sensitive information for extended periods of time.
Temporal tables provide historical retention period settings to help address this issue. When creating or altering a temporal table, this optional setting allows users to define how long to keep data in the table in question. The system will manage this functionality, meaning users will no longer have to manually delete data after a certain point. Users can provide integer values for four units of time: days, weeks, months, and years. The default option is infinite retention. In the statement below, our table will delete Customer records after the record has been in the system for seven years.
ALTER TABLE dbo.Customer SET (SYSTEM_VERSIONING = ON ( HISTORY_TABLE = dbo.Customer_History , HISTORY_RETENTION_PERIOD = 7 YEARS));
When Not to Use Temporal Tables
The use cases listed above show benefits of using temporal tables in your database, but this functionality is not without limitations. The limitations listed below can increase development time for your team or limit existing functionality in your system. Important potential drawbacks of temporal tables include:
Must have primary key defined for table.
Cannot truncate data.
Cannot affect historical data without altering the table (we’ll touch on this in a subsequent blog post).
Datatype varchar(max) will incur significant storage costs and limit performance.
The temporal history table must be defined in the same database as the current table.
In this blog post we’ve covered the basics of temporal tables, including a few use cases and example queries to apply to those cases. In the second part of our temporal table blog series, we’ll cover more advanced topics, such as correcting historical data and patching system times.
If you would like more information on getting started with temporal tables, we’d love to help! Feel free to reach out to us at email@example.com.