One of the more recent additions to SQL Server security features is the Dynamic Data Masking (DDM), included with the 2016 version. Like the Transparent Data Encryption I blogged about recently, DDM is a feature that is relatively easy to implement, and doesn’t require a lot of changes to the application. And just like pretty much everything is easy in a real life, it too has some limitations.
Dynamic Data Masking
DDM works by obfuscating parts of the query results based on the function used and the permissions the user have. The best way to think of the feature is, that rather than encrypting the data or changing it inside the database, it’s just limiting your view on it. Because of this, it’s not really the feature that will provide lot of extra security by itself and should be used together with other security implementations.
DDM will also work out-of-the-box with many applications due to fact that the database contents are not changed, just the query results.
There are however quite a few cases where it can simplify processes a great deal with minimum effort. Let’s say f.ex. that you have a situation where you need to allow an IT staff or a Developer to view tables that has Personally Identifiable Information (PII) data that they shouldn’t see in some of the columns. Then DDM might be the solution for that.
Let’s see how to implement something like that!
Implementing Dynamic Data Masking
Implementing DDM is quite easy even into existing tables, let’s use StackOverflow database and it’s User table as an example. There are few columns that look like good candidates for this purpose, DisplayName and Location
Adding a data mask is pretty much just executing two commands. First we add masking with default function to DisplayName
ALTER TABLE dbo.Users ALTER COLUMN DisplayName ADD MASKED WITH (FUNCTION = 'default()');
After which we then add one more mask to Location using the partial function. I’ll have bit more information about the various functions later on.
ALTER TABLE dbo.Users ALTER COLUMN Location ADD MASKED WITH (FUNCTION = 'partial(5, "YYYY", 0)');
And there you have it! Almost anyway, you’ll still need to do some work with the permissions as the users with UNMASK are the only ones who can view the data as it is. So let’s say that we have a developer we want to grant access to view some of the data, but mask the important bits.
CREATE USER [Developer] WITHOUT LOGIN GRANT SELECT ON dbo.Users TO Developer;
To simplify this a bit, I am just creating a user but not a login. It’ll be enough for the demo purposes. In real life, you’d be creating the login and then adding it as a user to the database. Let’s see what happens when we run a query using the Developer user.
EXECUTE AS USER = 'Developer'; SELECT TOP 20 U.Id , U.DisplayName , U.LastAccessDate , U.Location FROM dbo.Users AS U WHERE Id > 0; REVERT;
And there it is, masking the two columns just the way we set it up. Default function we used for DisplayName replaces the text with 4 X’s and partial masking we added for Location keeps first 5 letters, then replacing everything after that with 4 Y’s.
To allow users to see the actual values, we need to give them the UNMASK permissions. Let’s create another database user for this purpose.
CREATE USER [SuperDeveloper] WITHOUT LOGIN; GRANT SELECT ON dbo.Users TO [SuperDeveloper]; GRANT UNMASK TO [SuperDeveloper];
Less surprisingly, when we run the following query
EXECUTE AS USER = 'SuperDeveloper'; SELECT TOP 20 U.Id , U.DisplayName , U.LastAccessDate , U.Location FROM dbo.Users AS U WHERE Id > 0; REVERT;
We once again see everything without the masking applied. And that’s pretty much it. Next let’s look at briefly at the different functions we can use for the DDM.
Dynamic Data Masking Functions
We’ve already looked at two of the masking functions in DDM, the default and the partial. There are also two other functions called random and email. The email function is quite easy to guess on what it is used for, emails! It exposes first letter of an email address and replaces rest with XXX@XXX.COM, so if your email would be email@example.com, it’d appear jXXX@XXXX.com after being masked.
Random function is meant for masking numeric values and it does this by, as you probably already guessed it, using random numbers. All these different masking functions are actually quite nicely documented by Microsoft, so for the curious readers: You can find the full details of the masking functions behind this link.
Limitations of Dynamic Data Masking
As already mentioned earlier, DDM has very good things going for it. It’s relatively straightforward and easy to implement and in most cases it’s also transparent to the applications. However, things that sound this great usually have some drawbacks, and this same applies to DDM.
The weak part of the DDM is, that it’s only changing how the data is displayed. This means that you can just brute force your way around it. Let’s say that we want to find out users who are located in New York.
EXECUTE AS USER = 'Developer'; SELECT TOP 20 U.Id , U.DisplayName , U.LastAccessDate , U.Location FROM dbo.Users AS U WHERE Location = 'New York' REVERT;
DDM isn’t actually blocking you from finding these things out, it’s just making it little more inconvenient. While this was quite a simple example, you could also find out things like salaries hidden behind random masking by using BETWEEN in the WHERE clause.
So while definitely a useful feature in many cases, don’t rely on this feature alone to keep your data safe and protected.