Handling Private/Personally Identifiable Information

Last updated on 20 September 2016

When working with certain kinds of sensitive data, it is important to evaluate carefully how Drupal handles that information and to determine whether it meets your needs. The level of scrutiny required depends on the purpose of your site, its goals, and its context. For example, on a site that provides information to groups persecuted by a government, storing a user's IP address could be considered inappropriate, while on a site that includes health care information, you may be required to save the IP address of every user for a specific period of time. To match your site's purpose, goals, and context, you will need to verify that Drupal stores or omits these data as appropriate.

Below are some general guidelines to consider when building a site.

Consider what data in your site is private or sensitive

Review all the fields collected by a site. The easiest way to do this is often to look at the table definitions, along with some sample data, in your database. As you audit each field, consider:

  • Should we even collect this information? You may find cases where Drupal is collecting information that you don't want it to.
  • What handling practices do best practices or regulations require for this data? Personally identifiable information, health care data, financial data, and other classes of data may have specific requirements.
  • What data are being collected outside of the database, and what does that mean for the security of sensitive data? Load balancer or web server logs may, for example, record IP addresses and the information accessed, which may be a concern in certain situations.
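As a starting point for such an audit, the sketch below lists every column whose name hints at personal data. It is only an illustration under stated assumptions: it uses an in-memory SQLite database as a stand-in for the site's real database, the table and column names are hypothetical, and the `SENSITIVE_HINTS` list and the `flag_sensitive_columns` helper are invented for this example rather than being part of Drupal. Name matching is crude, so treat the output as a list of candidates for human review, not as a complete inventory.

```python
import sqlite3

# Hypothetical name fragments that often indicate personal data.
# Adjust this list to the fields your own site collects.
SENSITIVE_HINTS = ("mail", "name", "hostname", "phone", "address", "birth")


def flag_sensitive_columns(conn, hints=SENSITIVE_HINTS):
    """Return sorted (table, column) pairs whose column name contains a hint."""
    flagged = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        for column in conn.execute(f'PRAGMA table_info("{table}")'):
            col_name = column[1]
            if any(hint in col_name.lower() for hint in hints):
                flagged.append((table, col_name))
    return sorted(flagged)


if __name__ == "__main__":
    # Stand-in schema for demonstration only (assumed, not a real Drupal schema).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (uid INTEGER, name TEXT, mail TEXT, created INTEGER)"
    )
    conn.execute("CREATE TABLE access_log (id INTEGER, hostname TEXT, path TEXT)")
    for table, column in flag_sensitive_columns(conn):
        print(f"{table}.{column}")
```

Each flagged column can then be checked against the questions above: whether it should be collected at all, and what handling it requires.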

Consider encrypting/anonymizing private data

If a site has any data that requires special handling, consider when and where to encrypt the information.

  • Consider using HTTPS (or another transport-layer control, such as VPN-only access) to encrypt the data as it travels between the site and the browser.
  • Consider encrypting it inside the Drupal database (and storing the key somewhere safe) so that exposure of the database contents alone will not result in the exposure of this information. Also check whether the key ends up stored in a cache table in the same database, which would make this measure far less meaningful.
  • If encrypting the data inside the Drupal database, consider expiring and purging keys for data when they are no longer needed.
  • Consider sanitizing Drupal database dumps before distributing them to developers.
  • Consider whether sensitive data are best stored in a separate data storage location and accessed via an API that can be further secured rather than incorporated into the main site.
  • You may want to evaluate the use of whole-disk or folder-level encryption to secure your site's private data. If you do not have access to the physical storage medium on which your data rests, your hosting provider may be able to assist in ensuring that your disk, or virtual machine, is encrypted.
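To illustrate the point about sanitizing data before it is distributed to developers, here is a minimal sketch that scrubs a single user record. The row layout (`uid`, `name`, `mail`, `hostname`) and the helper names `truncate_ip` and `sanitize_user` are assumptions made for this example, not Drupal's API, and truncating an address to its network prefix is only one possible anonymization policy. Whether it is sufficient depends on your site's requirements.

```python
import ipaddress


def truncate_ip(ip_text):
    """Zero the host part of an address: keep /24 for IPv4, /48 for IPv6."""
    ip = ipaddress.ip_address(ip_text)
    prefix = 24 if ip.version == 4 else 48
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def sanitize_user(row):
    """Return a scrubbed copy of a user row; the original dict is not modified."""
    clean = dict(row)
    # Replace identifying values with placeholders derived from the numeric id.
    clean["name"] = f"user{row['uid']}"
    clean["mail"] = f"user{row['uid']}@example.invalid"
    clean["hostname"] = truncate_ip(row["hostname"])
    return clean


if __name__ == "__main__":
    # Hypothetical record using documentation-reserved address and domain.
    row = {
        "uid": 7,
        "name": "alice",
        "mail": "alice@example.org",
        "hostname": "203.0.113.57",
    }
    print(sanitize_user(row))
```

Applying a function like this to every row of an exported copy, rather than to the live database, keeps production data intact while giving developers a realistic but less sensitive data set.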