Some of you have probably heard about Dynamic Data Masking, and for those of you who haven’t, well, it’s only a matter of time before you’ll be using it. Companies face new requirements that force them to rethink their approaches to data privacy and implement new protections. The term ‘pseudonymisation’ has been introduced to encourage protection through measures like dynamic data masking technologies.

Data Masking (a.k.a. “data obfuscation” and “data scrambling”) provides organizations and data owners with better control over the exposure of their sensitive information.

For the past 20 years, Static Data Masking has been the go-to solution for masking data that’s being copied to a different location. This kind of solution is aimed at creating a secure development, training or other “safe” replica environment with the copied data.

We previously published an article, Dynamic vs. Static Data Masking, which explains the differences between the static and dynamic approaches. I recommend you check it out before continuing here.

With new EU regulations that affect companies worldwide, along with the non-stop increase in massive data breaches, here’s what you need to know to secure your sensitive and confidential information.

Dynamic Data Masking

Dynamic data masking means replacing particular information fields on-the-fly. This is how it works: When any application retrieves information from a database, the request first passes through a data masking apparatus that rewrites the query “on-the-fly” and then passes it on to the database. In other words, the database receives a query that includes more detailed instructions about exactly what data to return to the application.
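To make the rewriting step concrete, here is a minimal Python sketch of what such a rewrite could look like. It is an illustration under stated assumptions, not how HexaTier or any other product implements it: `MASK_RULES` and `rewrite_select` are hypothetical names, and the regular expression only handles a plain `SELECT a, b FROM ...` statement. A real proxy would need a full SQL parser.

```python
import re

# Hypothetical masking rules: column name -> SQL expression returned instead.
MASK_RULES = {
    "LoginID": "''",                  # "empty" the value
    "NationalID": "'CONFIDENTIAL'",   # replace with a fixed string
}


def rewrite_select(query, rules):
    """Rewrite the column list of a simple 'SELECT a, b FROM t ...' query."""
    match = re.match(
        r"\s*SELECT\s+(.*?)\s+FROM\s+(.*)", query, re.IGNORECASE | re.DOTALL
    )
    if match is None:
        return query  # not a simple SELECT: pass it through unchanged
    columns = [c.strip() for c in match.group(1).split(",")]
    # Swap each masked column for its replacement expression, keeping the
    # original column name as an alias so the application sees the same shape.
    rewritten = [f"{rules[c]} AS {c}" if c in rules else c for c in columns]
    return f"SELECT {', '.join(rewritten)} FROM {match.group(2)}"


rewritten_query = rewrite_select(
    "SELECT EmployeeID, LoginID FROM Employee WHERE ManagerID = 3", MASK_RULES
)
print(rewritten_query)
```

With these rules, the query above becomes `SELECT EmployeeID, '' AS LoginID FROM Employee WHERE ManagerID = 3`. The application still receives a LoginID column, but the database never returns its real content.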

This means that the Dynamic Data Masking mechanism works as a “reverse-proxy” to the database. This might sound limiting at first, because you need to re-route the traffic to the database through this apparatus, but implementing a reverse proxy in your infrastructure actually provides you with the flexibility you need to decide which applications need to pass their database requests through the reverse proxy and which do not.

That is the way that most Dynamic Data Masking solutions work, but it’s not enough to satisfy the needs of today’s databases.

Preparing to Mask

Before setting up masking, you need to figure out which sensitive information you need to mask and where it is located. Without this knowledge, no Dynamic Data Masking tool will be of any use, whichever one you choose.

With HexaTier Dynamic Data Masking, the first task that HexaTier performs is to scan your database for sensitive information. You can select one or more particular regulations with which your organization must comply in order to ensure that all regulated data is identified. For example, if you must comply with HIPAA, HexaTier can scan your database to locate any Personally Identifiable Information (PII) fields that are specifically regulated by HIPAA.

The scan only takes a few minutes. When it’s complete you will receive a list of tables and corresponding columns that contain sensitive information. HexaTier can automatically generate all the essential Dynamic Data Masking rules you need to protect these fields.
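For readers who want a feel for what a discovery scan produces, here is a small Python sketch against an in-memory SQLite database. It is only an approximation built on assumptions: the `SENSITIVE_PATTERNS` list and the `find_sensitive_columns` function are hypothetical, and the sketch matches on column names alone, whereas a real discovery tool may also inspect the data itself and apply regulation-specific rules.

```python
import re
import sqlite3

# Hypothetical name patterns that hint at sensitive content.
SENSITIVE_PATTERNS = [r"ssn", r"national.?id", r"login", r"birth", r"email"]


def find_sensitive_columns(conn):
    """Return sorted (table, column) pairs whose column name matches a pattern."""
    hits = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields one row per column; index 1 is the name.
        for row in conn.execute(f'PRAGMA table_info("{table}")'):
            column = row[1]
            if any(re.search(p, column, re.IGNORECASE) for p in SENSITIVE_PATTERNS):
                hits.append((table, column))
    return sorted(hits)


# Made-up schema for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee "
    "(EmployeeID INTEGER, NationalID TEXT, LoginID TEXT, ManagerID INTEGER)"
)
conn.execute("CREATE TABLE Product (ProductID INTEGER, Name TEXT)")

found = find_sensitive_columns(conn)
print(found)
```

The result is the same kind of list described above: table and column pairs (here the LoginID and NationalID columns of Employee) that can then be turned into masking rules.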

Screenshot of HexaTier’s Automatic Sensitive Data Discovery feature:

[Screenshot: sample of GreenSQL Automatic Sensitive Data Discovery results]

Once you have your sensitive information mapped, you can move ahead with the Dynamic (or Static) Data Masking implementation.

How Deep does the Rabbit Hole Go?

While most Dynamic Data Masking solutions make sure you are able to perform “on-the-fly” masking of information retrieved using native SQL queries (which means classic SELECT statements sent from an application to the database), they are completely blind to other types of information retrieval methods such as Stored Procedures or Views.

Stored Procedures cannot be dynamically masked by re-writing the queries at the request level because the Stored Procedure execution plan is already stored in the database, and the application just requests its execution, sometimes with specific parameters. In these cases, the solution is to mask information “on-the-fly” at the database response level, and not at the request level.
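Masking at the response level can be pictured as a filter over the result set rather than over the query. The Python sketch below is a simplified illustration, not a description of any product’s internals: `MASKED_COLUMNS` and `mask_response` are hypothetical names, the sample rows are made up, and the sketch assumes the result-set column names identify the sensitive columns, which would not hold if a procedure returned them under an alias.

```python
# Hypothetical policy: result columns whose values must be hidden.
MASKED_COLUMNS = {"LoginID", "NationalID"}


def mask_response(column_names, rows, masked=MASKED_COLUMNS, replacement=""):
    """Return copies of the rows with the masked columns replaced."""
    masked_positions = {
        i for i, name in enumerate(column_names) if name in masked
    }
    return [
        tuple(
            replacement if i in masked_positions else value
            for i, value in enumerate(row)
        )
        for row in rows
    ]


# Made-up result set, as it might come back from a stored procedure call.
columns = ["EmployeeID", "NationalID", "LoginID"]
rows = [(1, "111-11-1111", "alice"), (2, "222-22-2222", "bob")]

masked_rows = mask_response(columns, rows)
print(masked_rows)
```

Because the filter looks only at what comes back, it does not matter whether the rows were produced by a plain query, a View or a Stored Procedure.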

HexaTier, however, focuses on the data itself; regardless of the method the application uses to retrieve the information, HexaTier will dynamically mask the information at the request or the response level in a fully transparent way.

To perform at this level, HexaTier needs to parse all the Stored Procedures and Views located inside the database and fully understand which Stored Procedures or Views access which information. In doing so, HexaTier gives managers peace of mind that they are managing the exposure of their sensitive information at every level: information, table, column and even row.

The Grass isn’t Always Greener

Many Dynamic Data Masking tools work at the query level, meaning you have to “capture” the query that retrieves the information, and then you have multiple options for rewriting this specific query into the desired query that includes the Dynamic Data Masking enforcement. But who cares which query or which method is used to retrieve the information?

HexaTier gives you the option to choose the columns or tables that contain the sensitive information. You can do this manually, or automatically by using the Sensitive Data Discovery component, which comes ready to use out-of-the-box with any HexaTier installation.

Once the locations that contain the sensitive information have been set, it doesn’t matter which query or which method is used to retrieve this information. You can still limit the policy to the specific sources you want it to apply to, but the rest is completely transparent.
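One way to think about such a policy is as a small record that names the protected columns and, optionally, the sources it applies to. The Python sketch below is purely illustrative: the `MaskingPolicy` class, its fields and the source names `reporting_app` and `hr_app` are all assumptions made for this example and do not reflect HexaTier’s actual configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskingPolicy:
    """Hypothetical column-based masking policy."""

    table: str
    columns: frozenset
    sources: frozenset = frozenset()  # empty means "apply to every source"

    def applies_to(self, source, table, column):
        """True if a value from table.column must be masked for this source."""
        source_ok = not self.sources or source in self.sources
        return source_ok and table == self.table and column in self.columns


policy = MaskingPolicy(
    table="Employee",
    columns=frozenset({"LoginID", "NationalID"}),
    sources=frozenset({"reporting_app"}),
)

print(policy.applies_to("reporting_app", "Employee", "LoginID"))    # masked
print(policy.applies_to("hr_app", "Employee", "LoginID"))           # not masked
print(policy.applies_to("reporting_app", "Employee", "ManagerID"))  # not masked
```

Note that nothing in the policy refers to a query. The decision depends only on where the data lives and who is asking for it.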

HexaTier Dynamic Data Masking Column Selection and policy creation:

[Screenshot: dynamic data masking column selection]

One Step Forward: Conditional/Row-Level Dynamic Data Masking

Once you have chosen the columns, you may find that this is sometimes not enough. Masking at the column level means that any application that views the information gets the entire column masked. This is very limiting when you would like to dynamically mask the information according to specific parameters already used in the database. Because of this limitation, we can take Dynamic Masking one step forward and consider Row-Level Masking, also referred to as Conditional Dynamic Data Masking.

Conditional Dynamic Data Masking gives you the option of adding conditions on how and when the Dynamic Data Masking should operate, according to the value of a different column.

For example, in the screenshot below, I have a table containing some sensitive information about my employees:

[Screenshot: fixed string masking definition]

As shown above, any DBA or application that has access to the production database will be able to view this sensitive information.

The first thing that people mention when they see this is encryption, so let me stop you there for a second. Encryption is important! But it’s also important to understand that once you provide an application with the keys to decrypt the data, you have no control whatsoever over who is exposed to which type of information.

In the following configuration, we have chosen to dynamically mask the content of the “LoginID” and “National ID” columns located under Employee. We have also chosen the masking behavior “Empty” which should empty the content of these columns:

[Screenshot: Dynamically mask LoginID]

Creating this rule will dynamically empty the information of these columns at the presentation level, without changing the data at rest. And now the result looks like this:

[Screenshot: mask human resources pii]

This is great, but it’s not enough when applications use this information for multiple users and multiple purposes. For this, we can use conditional masking!

In the following configuration, HexaTier will dynamically empty the information of the “LoginID” and “National ID” columns located under Employee only if “ManagerID”, which is the third column, contains a value greater than 3:

[Screenshot: greensql-data-masking-login]

Once this Rule has been defined, the information will look like this:

[Screenshot: viewing masked data with greensql]

This means that the information will be visible only if an application retrieves it for a user, group or source that is defined with a ManagerID of 3 or smaller.
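The rule described here, which empties LoginID and National ID only where ManagerID is greater than 3, can be sketched in a few lines of Python. This is a hedged illustration of the idea rather than HexaTier’s implementation: `mask_rows_conditionally` is a hypothetical helper, the column name `NationalID` is written without a space for convenience, and the employee rows are made up.

```python
def mask_rows_conditionally(rows, masked_columns, condition, replacement=""):
    """Mask the given columns only in rows for which condition(row) is true."""
    result = []
    for row in rows:
        if condition(row):
            # Build a masked copy; the original row is left untouched,
            # just as the data at rest is left untouched.
            row = {
                key: (replacement if key in masked_columns else value)
                for key, value in row.items()
            }
        result.append(row)
    return result


# Made-up sample rows.
employees = [
    {"EmployeeID": 1, "NationalID": "111-11-1111", "ManagerID": 3, "LoginID": "alice"},
    {"EmployeeID": 2, "NationalID": "222-22-2222", "ManagerID": 16, "LoginID": "bob"},
]

masked = mask_rows_conditionally(
    employees,
    masked_columns={"LoginID", "NationalID"},
    condition=lambda row: row["ManagerID"] > 3,
)
for row in masked:
    print(row)
```

The first row (ManagerID 3) comes back unchanged, while the second (ManagerID 16) comes back with LoginID and NationalID emptied. Passing `replacement="CONFIDENTIAL"` instead would correspond to a fixed string behavior.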

This kind of configuration can be set with more than a dozen masking behaviors, for example, fixed string, which by default is configured as “CONFIDENTIAL”:

[Screenshot: fixed string masking definition]

And in this case, the information will look like this:

[Screenshot: masking sensitive data at row level]

Controlling the exposure of sensitive information at the row level provides organizations with an additional layer of control.

Some conclusions:

HexaTier provides you with the option to focus on the most important thing: the data itself. And it makes sure that the data will not be exposed.

While static data masking makes sure that a copy of the production data is masked, it has no influence whatsoever on the production database’s level of security or exposure.