CERN Data Storage Rules (in draft)

2013/06/14 CSO

These subsidiary rules to Operational Circular Nº11 provide rules on how to store Data at or by CERN in order to guarantee consistent handling of Data and the application of appropriate safeguards across the Organization.

Rules

The stipulations in paragraph I.2 of OC11 imply that any external storage service (e.g. "Cloud" service) used by a CERN Collaborator for official work must also comply with OC11. Where a CERN Contributor shares data with a third party, that Contributor is responsible for communicating the contents of the DPP and for making all reasonable efforts to have the third party respect the principles of this policy.

In order for a data store to be considered an "adequate storage facility" by the Data Protection Officer (OC11 III.C.17), the Data Processor (i.e. the data store manager) must fulfil the following rules:

  • All Data shall be stored such that proper access protections are guaranteed. If anonymous write access is granted, then read access must be sufficiently restricted to prevent abuse, and the uploaded content must be regularly monitored (e.g. for "drop-box" functionality).
  • By default, access protections of newly introduced Data must be equal to that of similar Data stored, or be restricted to the Data Controller only.
  • When stored or in transit, Sensitive Data shall be encrypted using suitable encryption methods to preserve the confidentiality and integrity of the data. The sole exception is where the data store or the transit path is physically isolated within the CERN Computer Centre and digitally secured by qualified CERN IT personnel.
  • Each access to Sensitive Data must be logged for traceability reasons, recording the date and time of the access as well as the accessing person. The Data Processor must be able to provide at any time a list of people who have privileged access to Sensitive Data (e.g. system administrators).
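
The logging rule above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not a CERN implementation: the names AccessRecord, AccessLog and the "resource" field are hypothetical, and a real data store would write to a tamper-resistant log rather than keep records in memory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessRecord:
    """One logged access to Sensitive Data (hypothetical structure)."""
    timestamp: datetime  # date and time of the access, kept in UTC
    person: str          # identifier of the accessing person
    resource: str        # assumption: a path or record id naming what was accessed


class AccessLog:
    """In-memory stand-in for an append-only access log (illustration only)."""

    def __init__(self, privileged_users):
        # People with privileged access, e.g. system administrators.
        self._privileged = set(privileged_users)
        self._records = []

    def record_access(self, person, resource, when=None):
        """Append one record; 'when' defaults to the current UTC time."""
        when = when or datetime.now(timezone.utc)
        entry = AccessRecord(timestamp=when, person=person, resource=resource)
        self._records.append(entry)
        return entry

    def accesses_by(self, person):
        """Return every logged access made by the given person."""
        return [r for r in self._records if r.person == person]

    def privileged_users(self):
        """Return the list the Data Processor must be able to provide at any time."""
        return sorted(self._privileged)
```

A data store would call record_access on every read or write of Sensitive Data and answer requests for the privileged-access list with privileged_users.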

The following data stores are currently considered to be "adequate storage facilities" for Sensitive Data:

  • All PCs and laptops whose hard disk(s) are encrypted as described here for the MS Windows or Apple Mac operating systems, respectively, or which employ dedicated encryption software such as GnuPG for the Linux operating system.

The following data stores are currently considered to be "adequate storage facilities" for Restricted Data:

  • (TBD)

By definition, any data store is adequate for Public Data.

Data Retention Rules

  • Sensitive Data shall be retained as follows:
    • Personal computing data such as passwords, private folders and emails shall be retained for at least six months and purged not later than two years after the end of the affiliation of the person concerned, in accordance with the standard grace periods.
  • Log data shall be purged after one year unless it is to be archived, in which case it shall be anonymized or consolidated. For the purpose of this rule, service actions and audit trails are not considered "log data".
  • For all other data, Data Controllers are encouraged to define appropriate data retention practices.
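
The retention periods above can be expressed as a small decision helper. The following Python sketch is only an illustration: the function names and return labels are hypothetical, and months and years are approximated as fixed day counts, which is an assumption not made in the rules themselves.

```python
from datetime import date, timedelta

# Assumption: calendar months and years are approximated by fixed day counts.
SIX_MONTHS = timedelta(days=183)
ONE_YEAR = timedelta(days=365)
TWO_YEARS = timedelta(days=730)


def personal_data_action(affiliation_end, today):
    """Classify personal computing data relative to the end of affiliation."""
    age = today - affiliation_end
    if age < SIX_MONTHS:
        return "retain"      # minimum retention period not yet over
    if age < TWO_YEARS:
        return "may_purge"   # between six months and two years
    return "must_purge"      # two-year purge deadline reached


def log_data_action(created, today, archived=False):
    """Classify log data; service actions and audit trails are out of scope."""
    age = today - created
    if age < ONE_YEAR:
        return "retain"
    # After one year: purge, unless the data is to be archived.
    return "anonymize_or_consolidate" if archived else "purge"
```

For example, personal_data_action(date(2013, 1, 1), date(2013, 3, 1)) falls inside the six-month minimum and yields "retain".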

Comments

  • For archival purposes, scanned or electronic documents should be maintained in "PDF/A" format.
  • If the title of data is itself Restricted Data or Sensitive Data, it must be protected separately. For example, a file name (i.e. the title of certain data) might be classified as Sensitive Data and, thus, must be protected at the folder level.
  • Access to "Restricted Data" is not limited to CERN MPs; access permissions might include external partners.
  • Who owns archived data and orphaned files (e.g. web sites, e-group archives)?
  • Another question that was raised with me concerns impersonation and manipulation of access rights. Certain members of IT-OIS (and possibly others) will have the ability to impersonate the credentials of anyone at CERN. Those running Active Directory have the ability to make themselves a member of any access group (consequently giving them access to any service). In the interest of transparency, this type of access ought also to be declared.
  • What about this situation: consider the Pay system or the EDMS conversion system. These services run on our machines, but the users (the GS service providers) build systems and scripts to do the job for the end users (e.g. generate the pay PDF). In doing so they may (or may not) generate logs and temporary files. These files may be stored on our machines. We may know about them if we have had to deal with them, but we may also be unaware of their existence, for example if the system never had a problem. We can certainly mention the items we know about, but whose responsibility is it to deal with them and to make sure they are handled following the directives?
  • If a data store depends on other stores (like DFS, AFS or tape storage), the Data Processor must also document how access rights are inherited and whether this inheritance is enforced.
  • Access permissions given to a CERN user without (re-)authentication might be treated as anonymous access in certain circumstances.
  • Administrator access to a server or application should not imply full access to the data.
  • The Data Processor must declare the default access protection level of data stored by the Data Controller.
  • DCP: Do we need to watermark documents, or is this covered by "The Data Originator is responsible for defining and tagging the classification level of their data."?