Tuesday, December 20, 2016

Privacy-friendly Forecasting for the Smart Grid using Homomorphic Encryption and the Group Method of Data Handling

Machine learning has become more and more popular last years. Every day new applications appear on the market such as augmented reality, pattern recognition, intelligent search, personal recommendations, forecasting etc. Big companies like Google, Amazon, Apple use comprehensive statistical models to improve their business. They rely on immense amounts of data that clients provide through different on-line services. This information can be very privacy sensitive and causes immediate damage after unapproved disclosure.
The question, then, arises: how can private data be encrypted and managed at the same time by machine learning algorithms? In our recent paper we tried to answer this question for the load forecasting problem in the smart-grid using homomorphic encryption.

The idea behind the smart-grid technology is very simple. It assumes that a client housing and a service provider are equipped with communication devices called smart-meters. These allow to transmit current information about load consumption, weather conditions, current financial information, structure of a local utility grid etc. A service provider can apply statistical techniques in order to forecast energy demand for a next period of time. Having predictions clients might control their consumption more efficiently and public utilities may optimise production and logistic costs to lower prices or avoid blackouts.

One popular class of prediction algorithms is based on artificial neural networks. ANNs compute the so-called activation functions, in practice it is common to use a sigmoid function where the logistic function $t \mapsto 1/(1 + e^{−t})$ is a popular choice. Due to the highly non-linear nature computing such a sigmoid function homomorphically is far from practical. To overcome that issue one can substitute a sigmoid by a polynomial function that was done recently in papers of Xie et al. and Dowlin et al. for pattern recognition purposes. But there exist an older statistical approach built especially for prediction purposes by a Soviet computer scientist Alexey Ivakhnenko in 1970. It is called the Group Method of Data Handling (GMDH).

GMDH takes a history of the system outcome $\{x_i\}$ and construct truncated Wiener series to approach the next output value

$a_0 + \sum_{i=1}^{n_0} a_i x_i + \sum_{i=1}^{n_0} \sum_{j=i}^{n_0} a_{ij} x_i x_j + \sum_{i=1}^{n_0} \sum_{j=i}^{n_0} \sum_{k=j}^{n_0} a_{ijk} x_i x_j x_k + \dots$

In the original paper this polynomial is called a Kolmogorov-Gabor polynomial and it can be nicely illustrated in a neural network manner

where each node has only two input edges and its own activation function described by a quadratic polynomial over real numbers

$\nu_{ij} : \textbf{R}^2 \rightarrow \textbf{R}: (x, y) \mapsto b_{ij0} + b_{ij1} x + b_{ij2} y + b_{ij3} xy + b_{ij4} x^2 + b_{ij5} y^2.$

The learning phase determines the polynomial coefficients and the wiring of nodes while the number of layers and the number of nodes at each layer are fixed in the beginning. Actually, one should solve a linear regression problem at each node and decide whether this node is among the "best" ones in a layer in terms of error performance. As a result a GMDH neural network can be expressed by an arithmetic circuit of multiplicative depth equal to the number of layers. This construction is a perfect match for homomorphic evaluation.

To perform experiments on real data we found a source provided by Irish government over 5000 housings and commercial buildings. It turns out that the prediction performance of ANNs for individual housings is quite low. But combining a few houses makes the forecasting error decrease drastically. Thus we considered aggregated consumption for 10 house combinations which correspond to small apartment complexes where privacy issues still make sense.

Finally, we applied an FV SHE scheme on top of the forecasting algorithm together with a fixed-point representation of real numbers recently studied by Costache et al. The simplest GMDH network with 4 layers was chosen to show reasonable error performance. The resulting algorithm evaluates a load demand for the next 30 min with relative error around 20% which is comparable with modern ANNs. The computation of one prediction takes 90 seconds in the sequential mode or 4 seconds in the parallel implementation on an average laptop equipped with an Intel Core i5-3427U CPU (running at 1.80GHz).

Please, check http://eprint.iacr.org/2016/1117 for further details.