3. De-identify as soon as possible

Whenever possible, remove personal identifiers from the rest of your data. There are various techniques for de-identification (anonymisation). Two common techniques are described below (10,11):

PSEUDO-ANONYMISATION

In this technique, the personal identifiers for each subject (e.g. name and contact information) are substituted with a unique code (commonly a unique research ID).

The ‘key’ that links each code to a subject’s identifiers is locked away separately from the rest of the data. Any information subsequently collected about the subject is labelled with the unique code only.

For example:

‘Ali bin Abu, Hospital number: T12345, DOB: 1 Jan 1940’ → Research ID: A001
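The substitution described above can be sketched in a few lines of Python. This is only an illustration, not a prescribed procedure: the function name `pseudonymise`, the field names, and the blood pressure value are all made up for the example, and the sketch assumes the records fit in memory.

```python
def pseudonymise(records, identifier_fields, prefix="A"):
    """Split records into a key table and de-identified rows.

    Hypothetical helper for illustration only.
    """
    key = {}            # research ID -> personal identifiers (store separately, locked away)
    deidentified = []   # research data labelled only with the research ID
    for n, record in enumerate(records, start=1):
        research_id = f"{prefix}{n:03d}"  # e.g. A001, A002, ...
        key[research_id] = {f: record[f] for f in identifier_fields}
        row = {"research_id": research_id}
        row.update({k: v for k, v in record.items() if k not in identifier_fields})
        deidentified.append(row)
    return key, deidentified


# Made-up record: the identifiers come from the example above,
# the measurement is invented.
records = [
    {"name": "Ali bin Abu", "hospital_number": "T12345",
     "dob": "1 Jan 1940", "systolic_bp": 142},
]
key, data = pseudonymise(records, ["name", "hospital_number", "dob"])
print(data)  # [{'research_id': 'A001', 'systolic_bp': 142}]
```

The `key` table is the part that must be kept apart from the research data, for example in a separate, access-restricted location.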

GENERALISATION

In this technique, precise values are replaced with broader, less specific ones. For example:

  • Date of birth can be replaced with age (e.g. DOB: 1 Jan 1940 → Age: 79);
  • Numbers can be replaced with a band (e.g. Income: RM1000 → Income: Low);
  • Names of places or people in an interview transcript can be replaced with a general description of them (e.g. Shop XYZ → a grocery store).
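The three replacements listed above could be sketched in Python as follows. The function names, the income thresholds (RM2000 and RM8000), and the reference date are assumptions chosen only to reproduce the examples; they do not come from any standard.

```python
from datetime import date


def age_on(dob, reference):
    """Age in completed years on the reference date."""
    years = reference.year - dob.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (reference.month, reference.day) < (dob.month, dob.day):
        years -= 1
    return years


def income_band(income_rm, low_below=2000, high_from=8000):
    """Map an income in RM to a band. Thresholds are hypothetical."""
    if income_rm < low_below:
        return "Low"
    if income_rm >= high_from:
        return "High"
    return "Middle"


def generalise_text(text, replacements):
    """Replace each specific name in a transcript with a general description."""
    for specific, general in replacements.items():
        text = text.replace(specific, general)
    return text


print(age_on(date(1940, 1, 1), date(2019, 2, 1)))  # 79
print(income_band(1000))                           # Low
print(generalise_text("I buy rice at Shop XYZ.", {"Shop XYZ": "a grocery store"}))
# I buy rice at a grocery store.
```

Wider bands and vaguer descriptions reduce the chance of re-identification further, but they also remove more detail from the data.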


The risk of re-identification varies with the technique used and the number of personal identifiers removed. The more identifiers removed, the lower the chance of re-identification. However, this must be balanced against the loss of usefulness of the remaining data.

For more methods of de-identification, you can refer to:

ICO. Anonymisation: managing data protection risk code of practice [Internet]. Information Commissioner’s Office; 2012. Available from: https://ico.org.uk/media/1061/anonymisation-code.pdf

Office for Civil Rights (OCR). Methods for De-identification of PHI [Internet]. HHS.gov. 2012 [cited 2019 Feb 1]. Available from: https://www.hhs.gov/hipaa/for-professionals/privacy/special-topics/de-identification/index.html

4. Store and transfer securely