Data obfuscation can take four broad forms:
- Masking Data
- Redacting Data
- Anonymizing Data
- Concealing Data
Masking Data is what you see when parts of a document are replaced with XXXX marks. Masking generally does not aim to hide the entire nature of the data you are targeting, so it can be appropriate when the data is sensitive but not classified.
Final Verdict: Masking leaves some clues as to what the data was.
Figure 1: A Credit Card Application masked.
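As a rough illustration of masking, the sketch below replaces all but the last few characters of a value with X marks. The function name `mask_value`, its parameters, and the sample card number are assumptions made for this example, not part of any particular tool.

```python
def mask_value(value: str, visible: int = 4, mask_char: str = "X") -> str:
    """Mask every letter or digit except the last `visible` ones.

    Separators such as dashes and spaces are kept, so the reader can
    still tell what kind of data this was.
    """
    total = sum(ch.isalnum() for ch in value)
    to_mask = max(total - visible, 0)  # how many characters to hide
    out = []
    for ch in value:
        if ch.isalnum() and to_mask > 0:
            out.append(mask_char)
            to_mask -= 1
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical card number used only for illustration.
print(mask_value("4111-1111-1111-1234"))  # prints XXXX-XXXX-XXXX-1234
```

The masked value keeps its length, its separators, and its last four digits, which is exactly the kind of clue masking leaves behind.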
Redacting Data is generally what we see on government documents, where an entire paragraph might be blacked out. This technique seeks to hide any clue as to what the data might have been. It is a good approach for large blocks of data that are classified for national security or compliance purposes. Compliance use cases include SOX, HIPAA, and other banking or customer privacy concerns.
Final Verdict: This is the most secure and robust type of data obfuscation because it leaves very few clues about the data.
Figure 2: A Credit Card Application redacted.
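A minimal sketch of redaction, assuming the application is held as a dictionary of field names and values: each sensitive value is replaced with the same fixed marker, so neither the content nor its length survives. The `redact_fields` helper, the marker text, and the sample record are hypothetical.

```python
REDACTED = "[REDACTED]"  # fixed-length marker, reveals nothing about the value


def redact_fields(record: dict[str, str], sensitive: set[str]) -> dict[str, str]:
    """Return a copy of `record` with every sensitive value blacked out."""
    return {
        field: (REDACTED if field in sensitive else value)
        for field, value in record.items()
    }


# Hypothetical application data used only for illustration.
application = {
    "FirstName": "Jane",
    "LastName": "Doe",
    "SocialSecurityNumber": "123-45-6789",
    "Gender": "F",
}
redacted = redact_fields(application, {"FirstName", "LastName", "SocialSecurityNumber"})
print(redacted)
```

Unlike masking, the output gives no hint of the original length or format; only the field names show that something was there.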
Anonymizing Data is best suited to situations where one wants to preserve the appearance or format of the data without divulging the actual data in question. Anonymizing Data might be useful for forms that have discrete fields. Imagine a Credit Card Application that has FirstName, LastName, Address, Date of Birth, Social Security Number, Previous Address, Mother’s Maiden Name, and Gender as possible fields that need to be obfuscated. It may be prudent to replace some of the fields, such as FirstName, LastName, Address, and Social Security Number, with random data and leave the other fields intact. This way the data still looks realistic in a presentation but does not reveal the identity of a person. This approach might be useful if one is trying to share some data with a support specialist. It is a very good option if you are only trying to partially hide sensitive data, that is, when only a limited number of fields are sensitive.
Figure 3: Credit Card Application anonymized
Concealing Data is essentially creating the illusion that the data was never there. Imagine if you could completely delete a paragraph from a document and pretend it never existed. An example of concealment would be deleting the Driver License field from a PDF form, so that there is no clue the identity block was ever there.
Figure 4: Credit Card Application with Concealed data fields
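Concealment can be sketched as removing the sensitive fields altogether, names included, so the output gives no sign they ever existed. As before, the `conceal_fields` helper and the sample record are hypothetical, and a real PDF form would need a PDF library rather than a dictionary.

```python
def conceal_fields(record: dict[str, str], sensitive: set[str]) -> dict[str, str]:
    """Return a copy of `record` with sensitive fields removed entirely."""
    return {
        field: value
        for field, value in record.items()
        if field not in sensitive
    }


# Hypothetical application data used only for illustration.
application = {
    "FirstName": "Jane",
    "LastName": "Doe",
    "DriverLicense": "D1234567",
    "Gender": "F",
}
concealed = conceal_fields(application, {"DriverLicense"})
print(concealed)
```

Compared with redaction, which leaves a visible marker in place of the value, the concealed record does not even contain the field name.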
Figure 5: What the original Credit Card Application might have looked like before the fields were obfuscated.