Telekom Slovenije Interview: Making data science work for operators and subscribers

The collection, storage, and usage of customer data is a hot-button issue across the telecoms industry post-Cambridge Analytica, particularly within Europe following the introduction of the EU’s GDPR laws.

With the help of CRMT, Exasol’s partner for the CEE region, Telekom Slovenije recently implemented Exasol to gather and use customer data more effectively. DT Editor James Barton spoke with the operator’s Head of Data Management Simon Brmez and Development Technologist Borut Rozac to discuss the practicalities, necessities and ethics of customer data.

How was Telekom Slovenije collecting and storing data prior to its decision to use Exasol solutions? Why did you feel the previous approach was ineffective?

Simon Brmez: We were using a traditional data warehousing approach, but discovered that this was not very suitable for data scientists – they usually have a more technical background and can do a lot by themselves, which is the type of privilege that we don’t afford to ordinary business analysts. For example, data scientists like to use open source platforms, like R or Python, and they prefer to analyse massive amounts of detailed data and look for correlations within it. This led us to believe our data warehousing approach was not suited to this type of work – it’s a question of approach, modelling, and the privileges that analysts have on the database.

Data scientists usually need privileges to create temporary tables and transfer massive amounts of data by themselves; it’s not just a technological question, but also an organisational one. Exasol was a solution that addressed all of this; it supports R and Python as in-database functionalities, and it’s an isolated sandbox environment for data scientists that allows us to be a lot more flexible on the privileges that we grant.

How has this enabled you to improve your operations?

Borut Rozac: We have many different data sources – from IoT systems to location-based services, the amounts of data involved are huge and it’s not possible to store everything directly in our data warehouse. We perform all the filtering and aggregate functions directly on Exasol’s database to avoid loading our data warehouse with these operations. We also use many open source tools, and we run these directly on Exasol’s database instead of transporting data to the application, saving some time on this aspect of data analysis.
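
As a rough illustration of the filter-then-aggregate pattern described above, the sketch below reduces raw usage events to one summary row per cell and hour before anything would be loaded into a warehouse. It is plain Python rather than Exasol’s in-database scripting, and the field names (`cell_id`, `hour`, `bytes`) and the function `aggregate_usage` are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def aggregate_usage(events):
    """Filter out unusable events, then sum usage per (cell_id, hour).

    `events` is an iterable of dicts with the hypothetical keys
    "cell_id", "hour" and "bytes". Only the aggregated rows are
    returned, so the detailed events never reach the warehouse.
    """
    totals = defaultdict(lambda: {"events": 0, "bytes": 0})
    for event in events:
        size = event.get("bytes")
        # Filtering step: skip events with missing or empty payloads.
        if size is None or size <= 0:
            continue
        # Aggregation step: one bucket per cell and hour.
        key = (event["cell_id"], event["hour"])
        totals[key]["events"] += 1
        totals[key]["bytes"] += size
    # Return rows in a stable order, ready to load downstream.
    return [
        {"cell_id": cell_id, "hour": hour, **stats}
        for (cell_id, hour), stats in sorted(totals.items())
    ]
```

The point of the design is that the volume leaving the analytical sandbox scales with the number of cell and hour combinations, not with the number of raw events.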

Operators increasingly feel that they are not monetising their data effectively – is this a regional trend within CEE or a wider global issue?

BR: Monetising telco data was quite a new idea, at least in Slovenia. There are a lot of restrictions in terms of GDPR; we’re trying to sell some data, but we’re also doing a lot of internal monetisation. Traditional business intelligence just reports on sales data, without including data from applications and other locked data.

GDPR requires that all data is as anonymous as possible, although if it’s for our internal use or some other use that falls within the right parameters, we can run analysis without limitation. For anything else, we have to collect approvals from customers in order to analyse and monetise their data.

Is this something that people are generally happy to opt into?

SB: If a customer receives a service in exchange for opting in – perhaps something location-based – then they tend to approve, but it needs to have a clear benefit and a good reason for using their data. If for example an app asks for their location in order to show them where nearby amenities are, that delivers a clear benefit. However, if an app requires their location in order to sell them insurance, they might disagree with sharing their location with insurance companies and therefore refuse the app. Approvals are therefore critical.

Are people only just becoming more aware of this issue?

SB: It’s relatively new. People have been giving away data to Facebook and Google for years now, but it’s only in recent months with the Cambridge Analytica scandal that people have woken up to the fact that all their information is being stored and sold for profit – and that this could potentially create problems for them if a potential employer were to look them up on Facebook. GDPR wasn’t written due to this particular issue with Facebook and Google – it was planned years ago, but its introduction has coincided with this scandal and this has raised awareness of the issue.

Going forward, will collection and storage of data be affected?

SB: It already is – we can only collect information that we explicitly need for our business, for example for billing. We can’t store any personal data without the customer’s approval – anything else has to be anonymous.

Do you think anonymous storage is the only approach that will be viable?

SB: No – there are other approaches. Before putting the data into absolutely anonymous form, we can work out a ‘cluster’ that a customer belongs to. This would only use basic demographic data – age or gender groups, for example – and then the cluster type could be used to work out some typical behaviour patterns, e.g. a customer in this group is likely to do this particular action. This isn’t 100% anonymous, but you’d need a lot more time and computing power to calculate whose data it was in real time.
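
The clustering idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the operator’s actual method: the age bands, the record fields (`customer_id`, `age`, `gender`) and the function names are all hypothetical. Each customer is mapped to a coarse demographic cluster, and behaviour is then counted per cluster so that the output no longer carries individual identifiers.

```python
from collections import Counter, defaultdict


def age_band(age):
    """Map an exact age to a coarse band (band edges are an assumption)."""
    if age < 25:
        return "under-25"
    if age < 45:
        return "25-44"
    if age < 65:
        return "45-64"
    return "65+"


def to_cluster(customer):
    """Build a cluster label from basic demographic data only."""
    return f"{customer['gender']}:{age_band(customer['age'])}"


def cluster_behaviour(customers, events):
    """Count actions per cluster instead of per customer.

    `customers` is a list of dicts with "customer_id", "age", "gender";
    `events` is a list of (customer_id, action) pairs. The result is
    keyed by cluster label, so no customer_id appears in the output.
    """
    cluster_of = {c["customer_id"]: to_cluster(c) for c in customers}
    counts = defaultdict(Counter)
    for customer_id, action in events:
        counts[cluster_of[customer_id]][action] += 1
    return {cluster: dict(actions) for cluster, actions in counts.items()}
```

As the interview notes, this kind of grouping is not fully anonymous: a very small cluster could still point to an individual, so a real deployment would likely also enforce a minimum cluster size.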
