What is data masking, and how does it help protect sensitive data?

M.F.M Fazrin
4 min read · Feb 23, 2023

Data masking is a process of obscuring or replacing sensitive data with fictitious or scrambled data, while still preserving the format and functionality of the original data. This technique is commonly used to protect sensitive data such as Personally Identifiable Information (PII) and protected health information (PHI).

There are different types of data masking techniques that can be used for different purposes and scenarios. Some of the common ones are:

  • Encryption: This technique transforms data into ciphertext that can only be read with the correct key. Unlike most other masking techniques, it is reversible for anyone who holds the key.
  • Scrambling: This technique rearranges the characters or digits of a value in random order, making the original value unreadable.
  • Pseudonymization: This technique replaces identifying values with artificial identifiers (pseudonyms), so that a record can no longer be linked to a person without additional information that is kept separately.
  • Nulling out or deleting: This technique removes sensitive data completely or replaces it with null values.
  • Substitution: This technique replaces sensitive data with realistic but fake values, such as names or addresses, drawn from a predefined list or range.
  • Shuffling: This technique swaps the values of a column among its rows, so every value keeps its format but no longer belongs to its original record.
  • Data variance: This technique adds or subtracts a random amount from numeric or date values, so each value differs from the original while staying in a plausible range.
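
A few of the techniques above can be sketched in a handful of lines of Python. This is only an illustration under stated assumptions: the function names, the `FAKE_NAMES` list, and the sample record are all hypothetical, and the fixed random seed is there only to make the demo repeatable.

```python
import random

# Hypothetical replacement pool for the substitution example.
FAKE_NAMES = ["Alex Doe", "Sam Roe", "Pat Poe"]

def scramble(value: str, rng: random.Random) -> str:
    """Scrambling: reorder the characters of a value at random."""
    chars = list(value)
    rng.shuffle(chars)
    return "".join(chars)

def substitute(rng: random.Random) -> str:
    """Substitution: pick a replacement from a predefined list."""
    return rng.choice(FAKE_NAMES)

def null_out(_value):
    """Nulling out: discard the value entirely."""
    return None

def vary(amount: float, rng: random.Random, max_delta: float = 100.0) -> float:
    """Data variance: shift a number by a random offset within +/- max_delta."""
    return amount + rng.uniform(-max_delta, max_delta)

rng = random.Random(42)  # fixed seed only so the demo is repeatable

# Made-up sample record, not real data.
record = {"name": "John Smith", "phone": "5551234567",
          "salary": 52000.0, "ssn": "123-45-6789"}

masked = {
    "name": substitute(rng),
    "phone": scramble(record["phone"], rng),
    "salary": vary(record["salary"], rng),
    "ssn": null_out(record["ssn"]),
}
print(masked)
```

Note that scrambling keeps every original character, so it is weak for short values; which technique fits depends on how the masked data will be used.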

Difference between static and dynamic data masking

The main difference between static and dynamic data masking is that static data masking permanently replaces sensitive data in a copy of the database, while dynamic data masking hides or replaces sensitive data in query results at read time and leaves the stored data unchanged.

Static data masking is useful for providing realistic and secure data for testing or development purposes. Dynamic data masking is useful for limiting access to sensitive data for certain users or applications.
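
The distinction can be sketched with Python's built-in sqlite3 module. This is an analogy, not SQL Server's implementation: SQLite has no masking feature or permission model, so a masked copy of the table stands in for static masking and a view stands in for dynamic masking. The table, view, and column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, ssn TEXT);
    INSERT INTO users VALUES (1, 'John', '123-45-6789'), (2, 'Jane', '987-65-4321');

    -- Static masking: build a separate copy and overwrite the sensitive column.
    CREATE TABLE users_masked AS SELECT * FROM users;
    UPDATE users_masked SET ssn = 'XXX-XX-' || substr(ssn, -4);

    -- Dynamic-style masking: stored rows stay intact; masking happens at read time.
    CREATE VIEW users_restricted AS
        SELECT id, name, 'XXX-XX-' || substr(ssn, -4) AS ssn FROM users;
""")

static_rows = conn.execute("SELECT ssn FROM users_masked ORDER BY id").fetchall()
dynamic_rows = conn.execute("SELECT ssn FROM users_restricted ORDER BY id").fetchall()
original_rows = conn.execute("SELECT ssn FROM users ORDER BY id").fetchall()

print("masked copy:  ", static_rows)
print("masked view:  ", dynamic_rows)
print("stored values:", original_rows)
```

The masked copy can be handed to a test team because the real values are no longer in it, whereas the view only protects the data as long as restricted users cannot query the base table directly.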

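To implement static data masking for SQL Server or Azure SQL Database, you can use the Static Data Masking feature that Microsoft introduced as a preview in SQL Server Management Studio (SSMS). Because it was released as a preview feature, confirm that your SSMS version includes it before relying on it.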

With this feature, you can configure how each column in your database is masked, and then create a masked copy of your database with new data generated according to your configuration. The original data cannot be unmasked from the masked copy.

You can use SSMS to connect to your SQL Server instance and launch the Static Data Masking wizard. You can then select the database and columns that you want to mask, choose a masking function for each column, preview the masked data, and export or import your masking settings.

Dynamic Data Masking

SQL Server (2016 and later) and Azure SQL Database provide built-in Dynamic Data Masking functions, including default(), email(), partial(), and random(). A mask is attached to a column definition: the stored data is not changed, and users without the UNMASK permission see masked values in their query results. Here are some examples of data masking in MS SQL Server:

  1. Partial masking with the email() function. Suppose you have a table named Users with a column Email that contains sensitive information. A dynamic mask is defined on the column rather than applied with an UPDATE, and the stored values are not changed. The email() mask exposes the first letter of the address and replaces the rest with the constant pattern XXX@XXXX.com, so the domain name is hidden as well. Here's an example:
ALTER TABLE Users ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');

Before (stored data, as seen by a user with the UNMASK permission):

ID   Name   Email
1    John   john@example.com
2    Jane   jane@example.com
3    Mary   mary@example.com

After (as seen by a user without the UNMASK permission):

ID   Name   Email
1    John   jXXX@XXXX.com
2    Jane   jXXX@XXXX.com
3    Mary   mXXX@XXXX.com

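  2. Full masking with the default() function. If you need to hide a column's value completely, you can use the default() mask, which returns XXXX for string columns. Dynamic data masking also offers a random() mask, but it applies only to numeric columns and takes a range, as in random(1, 100), so it cannot generate realistic-looking SSNs. Here's an example, assuming SSN is stored in a string column:
ALTER TABLE Users ALTER COLUMN SSN ADD MASKED WITH (FUNCTION = 'default()');

Before (stored data, as seen by a user with the UNMASK permission):

ID   Name   SSN
1    John   123-45-6789
2    Jane   987-65-4321
3    Mary   555-55-5555

After (as seen by a user without the UNMASK permission):

ID   Name   SSN
1    John   XXXX
2    Jane   XXXX
3    Mary   XXXX
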
  3. Nulling out with UPDATE. SQL Server has no NULL() function, and no dynamic mask returns NULL. To remove the values completely, set the column to NULL with a regular UPDATE. This is static masking: it permanently overwrites the stored data, so run it only on a copy of the database, and the column must allow NULLs. Here's an example:
UPDATE Users SET CreditCardNumber = NULL;

Before:

ID   Name   CreditCardNumber
1    John   1234-5678-9012-3456
2    Jane   9876-5432-1098-7654
3    Mary   5555-5555-5555-5555

After:

ID   Name   CreditCardNumber
1    John   NULL
2    Jane   NULL
3    Mary   NULL
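
For intuition about what SQL Server's built-in Dynamic Data Masking functions return to a user without the UNMASK permission, their documented output shapes can be approximated in Python. The helper names below are hypothetical, and the handling of short values and non-string types is a simplification rather than SQL Server's exact behavior.

```python
def mask_email(value: str) -> str:
    """Approximate email(): keep the first letter, then the constant XXX@XXXX.com."""
    return value[:1] + "XXX@XXXX.com"

def mask_partial(value: str, prefix: int, padding: str, suffix: int) -> str:
    """Approximate partial(prefix, padding, suffix): keep both ends, replace the middle."""
    if prefix + suffix >= len(value):
        # Simplification (assumption): expose nothing when the value is too short.
        return padding
    head = value[:prefix]
    tail = value[len(value) - suffix:]  # safe when suffix == 0, unlike value[-0:]
    return head + padding + tail

def mask_default(value):
    """Approximate default(): XXXX for strings, 0 for numbers (other types omitted)."""
    if isinstance(value, str):
        return "XXXX"
    if isinstance(value, (int, float)):
        return 0
    return None

print(mask_email("john@example.com"))
print(mask_partial("123-45-6789", 0, "XXX-XX-", 4))
print(mask_default("1234-5678-9012-3456"))
```

In SQL Server itself these masks are declared on the column with ADD MASKED WITH (FUNCTION = '...') and applied by the engine at query time, so no application code like this is needed.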

Data masking offers several advantages, including:

  1. Improved data security: By masking sensitive data, you can prevent unauthorized access and protect confidential information from being exposed. This reduces the risk of data breaches, identity theft, and other forms of cyber attacks.
  2. Regulatory compliance: Data masking helps organizations comply with various data protection regulations such as GDPR, HIPAA, and PCI-DSS. These regulations require organizations to safeguard sensitive information and prevent unauthorized access.
  3. Flexibility: Data masking allows you to modify the level of masking depending on the sensitivity of the data. For instance, you can partially mask some data, fully mask others, or leave some data unmasked. This gives you more control over the level of protection applied to different types of data.
  4. Cost-effectiveness: Data masking is often a cost-effective way to protect sensitive data, because it can rely on features already built into the database or on a one-time masking job. It complements other controls, such as access management and auditing, rather than replacing them.
  5. Testing and development: Data masking enables developers and testers to work with realistic data instead of real production data. This allows them to perform accurate and meaningful tests and simulations without exposing confidential information.
  6. Data analytics: Data masking can be used to generate realistic data for data analytics and machine learning purposes. Techniques such as shuffling and data variance keep column-level statistics, such as averages and value distributions, close to those of the original data, although they can weaken the relationships between columns.
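
The point about statistical properties can be checked with a small Python sketch. The salary figures are made up, and the seed and the variance range of 500 are arbitrary assumptions chosen for the demo.

```python
import random
import statistics

salaries = [41000, 52000, 67000, 58000, 73000]  # made-up sample values

rng = random.Random(7)  # fixed seed only so the demo is repeatable

# Shuffling: the same values, reassigned to different rows.
shuffled = salaries[:]
rng.shuffle(shuffled)

# Data variance: each value moves by at most 500 in either direction.
varied = [s + rng.randint(-500, 500) for s in salaries]

print("original mean:", statistics.mean(salaries))
print("shuffled mean:", statistics.mean(shuffled))
print("varied mean:  ", statistics.mean(varied))
```

Shuffling leaves the column's mean and distribution exactly as they were, while variance keeps the mean within the chosen offset. Neither guarantees that correlations with other columns survive, so the choice depends on what the analysis needs.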

Overall, data masking is a critical security measure that helps organizations protect their sensitive data from unauthorized access and comply with data protection regulations.

Written by M.F.M Fazrin

Senior Software Development Specialist @ Primary Health Care Corporation (Qatar)
