Over the last decade, more and more companies have realised that data has intrinsic value and is not merely a byproduct of a business activity or process. Whether it is transactional, customer interaction, social or sensor data, companies are capturing and managing an increasing number of valuable data sets, which may also be of interest to cyber criminals.

As a result, data security has become paramount for the rapidly growing number of data teams and projects. Failing to treat data security as a priority can lead to business disruption or reputational damage, and can jeopardise compliance with the General Data Protection Regulation[1] (GDPR), which became applicable on 25 May 2018. The GDPR aims precisely at better protecting all EU citizens from privacy and data breaches.

I had the opportunity to address this topic in a fruitful group discussion, held as a World Café[2] at the Rethink! IT Security 2018 conference[3], under the title "Effective digitalisation as an interplay between data usage and cybersecurity"[4]. I would like to thank all participants, as well as we.CONECT[5] for organising the conference.

We started the discussion with three questions, from which we developed some interesting lines of thought that I want to share with you in this post. They can help data teams and their stakeholders appreciate various aspects of data security in the context of their data projects.

The first question dealt with the meaning of enterprise big data from the perspective of data security. With the second question we investigated the obstacles that organisations and data scientists must overcome in the context of data security. The third question focused on ways to ensure data relevance for successful digitalisation and a meaningful application of data science.

Organisations may have different aims in gathering data. Participants focused especially on the following use cases: behaviour analysis, personalisation, cross-selling, quality assurance (e.g. predictive maintenance) and operations (e.g. system log analysis).

We can classify the data security challenges they encountered as follows:

  1. Protection of data assets from unsolicited or inappropriate access
  2. Compliance with restrictive data regulations, with challenges such as data anonymisation, the use of sensitive data for research, or creative analysis that extends the original purpose of the data
  3. Agile data science where speed in data availability is required
  4. Discovery of data sources already available to the organisation
  5. Organisation structures as an obstacle to data usage
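
To make the anonymisation challenge in item 2 more concrete, here is a minimal Python sketch that replaces a direct identifier with a keyed hash (HMAC-SHA256) from the standard library. It was not part of the discussion; the function names, the field names, the sample record and the key handling are all hypothetical. Note that keyed hashing is pseudonymisation rather than anonymisation: whoever holds the key can re-link the records, so under the GDPR the output is still personal data.

```python
import hashlib
import hmac


def pseudonymise(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed, deterministic token (HMAC-SHA256)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymise_record(record: dict, identifier_fields: set, secret_key: bytes) -> dict:
    """Return a copy of the record with the listed identifier fields tokenised."""
    return {
        name: pseudonymise(str(value), secret_key) if name in identifier_fields else value
        for name, value in record.items()
    }


# Hypothetical example data; in practice the key would come from a secrets store.
key = b"replace-with-a-key-from-a-secrets-store"
record = {"customer_email": "jane@example.com", "basket_value": 42.5}
safe_record = pseudonymise_record(record, {"customer_email"}, key)
```

Because the token is deterministic for a given key, data sets can still be joined on the tokenised field, which keeps analysis possible while the raw identifier stays out of the data team's working copies.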

From the shared experience we identified some initiatives which can be taken to help organisations deal with those concerns:

  • Ensure data governance with an appropriate system of values. These activities might include classifying the information available to the organisation in the form of a data catalogue, appointing data owners who monitor and control data source usage, and restricting data collection to what is relevant to specific business purposes.

  • Foster legal awareness among data scientists and data engineers. Data scientists should feel compelled to document the data sources they use and verify their legal status. Furthermore, GDPR compliance needs to be validated and regularly reviewed, especially against any unnoticed purpose extension.

  • Ensure close collaboration between data, business and IT security teams. Appropriate measures might include maintaining guidelines for secure data usage, clarifying the business interest in data sets through a larger involvement of business departments, or establishing proper data access monitoring.

  • Apply risk analysis to data organisations and projects. Regulate data usage within the organisation, and also per project, through proper modelling of the risks attached to the related data activities and data sources. Assess data protection requirements on a per-project basis: discover the necessary data categories, ascertain mandatory preparation steps and justify legitimate business interests for the usage and correlation of various data sets.

  • Automate data preparation and transformation. This can also contribute to better control of data usage, and reduce both risk and integration effort for the data teams.
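
As an illustration of how the governance and legal-awareness points above could be supported in tooling, the following Python sketch models a data catalogue entry with a classification, an owner and the purposes the data was collected for. It refuses a request whose purpose is not documented and records every request for later monitoring. This is a sketch under assumptions, not something presented at the conference: all class, function, field and data set names are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Classification(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    PERSONAL = "personal"  # personal data in the GDPR sense


@dataclass
class CatalogueEntry:
    name: str
    owner: str
    classification: Classification
    permitted_purposes: set = field(default_factory=set)


@dataclass
class AccessDecision:
    granted: bool
    reason: str


def request_access(entry: CatalogueEntry, requester: str, purpose: str,
                   audit_log: list) -> AccessDecision:
    """Grant access only for a documented purpose and record every request."""
    if purpose in entry.permitted_purposes:
        decision = AccessDecision(True, "purpose is documented in the catalogue")
    else:
        decision = AccessDecision(
            False, f"purpose extension: ask data owner {entry.owner} for a review"
        )
    # Every request is logged, whether granted or not, to support access monitoring.
    audit_log.append((requester, entry.name, purpose, decision.granted))
    return decision


# Hypothetical catalogue entry and requests.
audit_log = []
clickstream = CatalogueEntry(
    name="web_shop_clickstream",
    owner="head-of-ecommerce",
    classification=Classification.PERSONAL,
    permitted_purposes={"personalisation"},
)
ok = request_access(clickstream, "data-scientist-a", "personalisation", audit_log)
blocked = request_access(clickstream, "data-scientist-b", "cross-selling", audit_log)
```

Here the second request is refused because cross-selling is not among the documented purposes, which turns an otherwise unnoticed purpose extension into an explicit review with the data owner.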

Do not hesitate to share this post with data teams and organisations. Although there is no such thing as 100% security, the findings described above might help increase awareness and get better at securing data assets while still ensuring creativity, speed and agility in data projects.

  1. Nate Lord. What is GDPR (General Data Protection Regulation)? Understanding and Complying with GDPR Data Protection Requirements. Source: https://digitalguardian.com/blog/what-gdpr-general-data-protection-regulation-understanding-and-complying-gdpr-data-protection, visited on 30.05.2018

  2. The World Café. World Cafe Method. Source: http://www.theworldcafe.com/key-concepts-resources/world-cafe-method/, visited on 30.05.2018

  3. we.CONECT. Rethink! IT Security 2018 conference Agenda. Source: https://rethink-it-security.de/agenda/, visited on 30.05.2018

  4. Cyrille Waguet. Big Data Security Café | Effektive Digitalisierung als Zusammenspiel der Datennutzung und Cybersicherheit. Source: https://rethink-it-security.de/sessions/world-cafe-4-big-data-security-cafe-effektive-digitalisierung-als-zusammenspiel-der-datennutzung-und-cybersicherheit/, visited on 30.05.2018

  5. we.CONECT. Homepage. Source: http://we-conect.com/, visited on 30.05.2018