Data minimization on dev / staging environments for PII [#2971799]

Personally Identifiable Information (PII) should not be stored on non-production environments. We therefore need a mechanism to mask the data for non-prod while ensuring it is still meaningful.

What is the best approach to deal with this in Drupal?

Originally posted by @Dubs in #2848974-9: Privacy Concerns as GDPR Compliance

Comments

Comment #1

9 May 2018 at 19:56

mgifford created an issue. See original summary.

Comment #2

gisle

he/him

Norwegian Bokmål

Norway

commented 11 May 2018 at 12:37

Isn't this a duplicate of #2971800: Pseudonymisation - Separating PII data from non-PII data?

Can they be merged and one of them closed?

Comment #3

mgifford

he/him

English

commented 13 May 2018 at 15:12

Possibly. I'm fine with going that way.

I see this as more about the process of moving content between production & other environments. I see #2971800: Pseudonymisation - Separating PII data from non-PII data as more about identifying the information that includes PII vs. that which doesn't.

They are certainly related.

Comment #4

gisle

he/him

Norwegian Bokmål

Norway

commented 13 May 2018 at 15:45

It was only a suggestion - it is your call.

My thinking was this: If you have implemented #2971800: Pseudonymisation - Separating PII data from non-PII data, there will be no problem exporting data from your production environment. It is already pseudonymised, so you can safely export it (provided the database with the PII data is secure, and not exported along with the pseudonymised data).

In other words, moving data between production and other environments is just a special case of the privacy issues that pseudonymisation is intended to solve.

Comment #5

mgifford

he/him

English

commented 14 May 2018 at 13:10

It will be encrypted on production site, but pseudonymised when exported to staging/dev sites, right?

Maybe I just don't understand the process. But I see them as two different things.

How you export the data so that you have a meaningful replication of production is different than the stage of identifying & encrypting PII.

But ya, maybe I'm getting this wrong.

Comment #6

gisle

he/him

Norwegian Bokmål

Norway

commented 14 May 2018 at 14:37

mgifford wrote: "It will be encrypted on production site, but pseudonymised when exported to staging/dev sites, right?"

Well, at least that is not how I think about these things.

Here is a brief description of what our current system does:

On the production site we simply replace all items classified as Personally Identifiable Information (PII, i.e. names, phone numbers, addresses, credit-card numbers, etc.) with a 128-bit Universally Unique Identifier (UUID). Note that the UUID is a pseudonym; it is not an encrypted version of the PII.

Then, in a second production database, we store records that link each UUID in the system back to its corresponding cleartext PII. To add another layer of security, you could encrypt this second production database, but according to our DPIA (Data Protection Impact Assessment) this is not necessary for us (YMMV), so we don't use encryption.

To export the data (for all purposes) we simply export the pseudonymised database we use in production. The second database (the one linking UUIDs to PII) is not exported, but kept in a very secure location. This means that no-one with access to the exported data will be able to go from the UUIDs they have access to, to PII.
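The arrangement described so far can be sketched roughly as follows. This is only an illustration in Python, not the code of the system described above: the `PseudonymStore` class, the `PII_FIELDS` tuple, the helper names and the sample record are all hypothetical, and the sketch assumes each PII value gets its own random (version 4) UUID.

```python
import uuid


class PseudonymStore:
    """Hypothetical stand-in for the second database (UUID -> cleartext PII).

    In the arrangement described above, this store is never exported.
    """

    def __init__(self):
        self._pii_by_uuid = {}

    def pseudonymise(self, pii):
        # The UUID is random, not derived from the PII, so it is a
        # pseudonym rather than an encrypted form of the value.
        token = str(uuid.uuid4())
        self._pii_by_uuid[token] = pii
        return token

    def resolve(self, token):
        # Only code with access to this store can get back to the PII.
        return self._pii_by_uuid.get(token)


# Assumption: which fields count as PII is decided elsewhere (e.g. a DPIA).
PII_FIELDS = ("name", "phone")


def pseudonymise_record(record, store):
    """Return a copy of the record with every PII field replaced by a UUID."""
    return {
        key: store.pseudonymise(value) if key in PII_FIELDS else value
        for key, value in record.items()
    }


def export_for_non_production(main_db):
    """Export only the pseudonymised rows; the PseudonymStore stays behind."""
    return [dict(row) for row in main_db]


store = PseudonymStore()
main_db = [
    pseudonymise_record(
        {"name": "Ada Example", "phone": "555-0100", "plan": "basic"}, store
    )
]
exported = export_for_non_production(main_db)
```

Here `exported` keeps the non-PII column (`plan`) intact, so the copy remains meaningful for dev and staging, while the `name` and `phone` columns hold only UUIDs that cannot be resolved without the store.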

As a nice bonus, this arrangement also lets us comply with the right to erasure. When a data subject exercises his or her rights under Article 17 of the GDPR, we just delete the single record in the second database where the relation between the PII and the UUID is stored. Everywhere else in our system, what remains is just the UUID. It is no longer a pseudonym because it cannot be linked back to the PII, so it is no longer personal data.
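The erasure step can be illustrated with a minimal Python sketch. Again, this is an assumption-laden example rather than the actual implementation: the `pii_by_uuid` dictionary stands in for the second database, and `pseudonymise`, `erase` and the `orders` table are hypothetical names.

```python
import uuid

# Hypothetical second database: maps each UUID to its cleartext PII.
pii_by_uuid = {}


def pseudonymise(pii):
    """Store the PII under a fresh random UUID and return that UUID."""
    token = str(uuid.uuid4())
    pii_by_uuid[token] = pii
    return token


def erase(token):
    """Delete the single UUID -> PII record; return True if one existed."""
    return pii_by_uuid.pop(token, None) is not None


token = pseudonymise("Ada Example")

# Hypothetical main-database table that only ever sees the UUID.
orders = [{"customer": token, "total": 42}]

erased = erase(token)
```

After `erase(token)`, the `orders` rows are untouched and still carry the UUID, but nothing in the system can map that UUID back to a person.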

mgifford wrote: "How you export the data so that you have a meaningful replication of production is different than the stage of identifying & encrypting PII."

It may be different, or it may be the same, depending on what means you use to solve the problems.

Comment #7

mgifford

he/him

English

commented 14 May 2018 at 15:58

Status: Active » Closed (duplicate)

Let's mark this one as a duplicate then. Thanks for the description of how you've done it.
