Part I

TIME FOR CONNECTED DATA

Data Availability

We live in a data-driven world. Some researchers even call it the “data age” [1], because over the past 30 years data has become indispensable to all aspects of human life, be it on a business or on a personal level. The number of connected consumer devices has already surpassed the number of human beings on earth [2], which has dramatically increased the number of digital interaction channels. And with the coming tidal wave of devices from the Internet of Things (IoT), more and more sensors of all kinds will generate steadily growing datasets.

Whether production machines or smart devices, all of them create, capture, store, replicate and share data, which is increasingly exploited through statistical analysis and AI algorithms and contributes to the “global datasphere” on our planet [1].

This tight integration of data into our everyday life has essentially been made possible by two factors:

  1. the substantial technological developments in computational power, memory, storage and software, in conjunction with falling hardware prices, have drastically increased the capacity for collecting, storing and processing all kinds of data in all industries. The contribution of community-supported open source software projects greatly accelerated this movement and led to broader and easier access to relevant data technologies. Two more technological aspects have also been instrumental in the wide availability of data: a) the increase in network capabilities, including bandwidth and distribution, and b) the ability of embedded devices to produce, share and transmit data. The latter in particular has made it extremely easy for anyone to produce and share digital content on a large scale.

  2. the social acceptance of data-driven technology to increase everyday comfort and optimise businesses. Communicating via email or social media, or searching for information on the internet, has become part of our everyday life. According to the Global Digital Report [2], more than 4 billion people are using the internet and over 3 billion are social media users. You don’t even need to be a computer expert anymore to implement some simple data pipelines on your own, like collecting and analysing health data from sport exercises.
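To give an idea of how small such a personal data pipeline can be, here is a minimal Python sketch. The workout records, field names and the `summarise` helper are invented for illustration and do not come from any particular fitness tracker or app.

```python
from statistics import mean

# Hypothetical workout log, e.g. typed in by hand or exported from a tracker.
# All field names and values are assumptions made for this example.
workouts = [
    {"date": "2018-04-02", "distance_km": 5.0, "duration_min": 30.0, "avg_heart_rate": 148},
    {"date": "2018-04-05", "distance_km": 10.0, "duration_min": 65.0, "avg_heart_rate": 152},
    {"date": "2018-04-08", "distance_km": 5.0, "duration_min": 28.0, "avg_heart_rate": 150},
]


def pace_min_per_km(workout):
    """Return the pace of one workout in minutes per kilometre."""
    return workout["duration_min"] / workout["distance_km"]


def summarise(workouts):
    """Aggregate a list of workouts into a few simple statistics."""
    return {
        "sessions": len(workouts),
        "total_km": sum(w["distance_km"] for w in workouts),
        "mean_pace": mean(pace_min_per_km(w) for w in workouts),
        "mean_heart_rate": mean(w["avg_heart_rate"] for w in workouts),
    }


if __name__ == "__main__":
    for key, value in summarise(workouts).items():
        print(key, value)
```

Collecting, transforming and aggregating are the same three steps that larger pipelines perform, only at a much smaller scale.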

Data Connectivity

The amount of data accumulated so far is huge, and there are many references regarding its estimated size [3]. Some studies emphasise that 90% of the world’s data since the digital revolution of the 1970s was created in the last two years [4] and that the total is expected to double every two years.

Because many types of devices and processes generate data, it can be divided into five essential categories:

  • Transactional data mostly represents structured business documents, covering everything from a purchase order to a goods delivery note or customer master data.
  • Productivity data consists of semi-structured files coming from traditional computing platforms like servers, laptops, tablets, phones and similar.
  • Social data is essentially unstructured data that social media users share, such as their socio-demographics, location, reviews or posts on various platforms.
  • Multimedia data is created and consumed either for entertainment purposes, like movies or music, or for non-entertainment purposes, like video monitoring.
  • Embedded data is produced by smart devices and various chips from the IoT like wearables.

Predictions mostly agree that the overall data size will increase, and some forecasts show that, while multimedia data will remain the most substantial part of the “global datasphere”, the proportion of data from embedded devices will grow to reach 20 percent by 2025 [1]. Just as we saw personal and business engagement in multimedia and social media grow enormously in the last couple of years, the broad deployment of embedded devices will lead to an entirely new level of personal and business interactions with data.

With the increasing amount and diversity of data sources, most organisations will face the need to connect pools of data that never needed to be brought together before. Data typically collected in silos must be shared and connected to gain new insights and enable new personalised products and services. Connected data is becoming a fundamental concept for achieving the deep contextual understanding necessary to provide the highly personalised and relevant engagements required in today's digital world. Mastering the connection between datasets from all the categories described above will be key to meeting customers’ expectations in the future. The resulting impact on people’s daily life will be profound, as data intelligence will become inherent to their environment.
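As a rough illustration of what connecting siloed data can mean in practice, the following Python sketch links records from three of the categories above through a shared key. The `customer_id` key, the sample records and the `connect` helper are assumptions made for this example, not a description of any specific system.

```python
from collections import defaultdict

# Hypothetical records from three separate silos. The shared "customer_id"
# key and every value below are invented for illustration.
orders = [  # transactional data
    {"customer_id": "c1", "order_id": "o-100", "amount": 59.90},
    {"customer_id": "c2", "order_id": "o-101", "amount": 120.00},
    {"customer_id": "c1", "order_id": "o-102", "amount": 15.50},
]
reviews = [  # social data
    {"customer_id": "c1", "rating": 4},
]
wearable_readings = [  # embedded data
    {"customer_id": "c1", "steps": 8200},
    {"customer_id": "c2", "steps": 4100},
    {"customer_id": "c1", "steps": 10400},
]


def connect(orders, reviews, readings):
    """Group records from all three silos into one profile per customer."""
    profiles = defaultdict(lambda: {"orders": [], "reviews": [], "readings": []})
    for order in orders:
        profiles[order["customer_id"]]["orders"].append(order)
    for review in reviews:
        profiles[review["customer_id"]]["reviews"].append(review)
    for reading in readings:
        profiles[reading["customer_id"]]["readings"].append(reading)
    return dict(profiles)


if __name__ == "__main__":
    for customer_id, profile in connect(orders, reviews, wearable_readings).items():
        print(customer_id, {name: len(items) for name, items in profile.items()})
```

Real projects rarely have such a clean shared key. Matching identities across sources is usually the hard part, and it is where governance and legal constraints come into play.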

As companies put systems in place to ensure data connectivity, they also need to ensure that this augmented data is made accessible throughout the enterprise for use in analytics and advanced AI systems, in a timely manner and in compliance with corporate and legal constraints. Technologies like machine learning or natural language processing make it possible to increase the scope and frequency of data analysis and even support real-time decision making and interactions as well as automation. While those capabilities, combined with the depth of connected data, open virtually unlimited fields of application, enterprises will need to develop and foster a connected data mindset internally and facilitate collaboration across departments and their ecosystem in order to best exploit the promises of the connected data world. And this will need to happen in compliance with restrictive corporate policies and legal requirements on data usage.

You can find the second part of the blog about connected data here.


  1. International Data Corporation. Data Age 2025: The Evolution of Data to Life-Critical. Don’t Focus on Big Data; Focus on the Data That’s Big. Source: https://www.seagate.com/de/de/our-story/data-age-2025/, visited on 11.04.2018

  2. We Are Social Deutschland GmbH. Global Digital Report. Source: https://wearesocial.com/de/blog/2018/01/global-digital-report-2018, visited on 11.04.2018

  3. Digital Financial Reporting. Need for New Global Standard Spreadsheet Alternative. Source: http://xbrl.squarespace.com/journal/2014/5/3/need-for-new-global-standard-spreadsheet-alternative.html, visited on 11.04.2018

  4. FST Media. 90% of the world’s data was created post-2015: PwC. Source: http://fst.net.au/news/90-worlds-data-was-created-post-2015-pwc, visited on 11.04.2018

    Olga Mordvinova

    CEO at incontext.technology, software engineer with a background in distributed systems, information retrieval and analytics. Passionate about connecting use cases, data, technology and effective interaction.