
In classification the goal is to determine to which of the classes, defined according to the clusters found earlier, the input data belongs. The input data can be, for example, numeric information about energy consumption. Patterns are then looked for in that data. The patterns are features that are common to all inputs but which differ slightly between classes, so based on these features the input can be assigned to one of the classes. Figure 4 shows how an input is classified into the correct cluster found earlier.

Figure 4. Classifying input.


For classification there are several different methods that can be taken into consideration, for example NN, Probability Density Functions (PDF), Markov Models (MM) or AdaBoost. Of course the selection of the method depends on the problem it should solve and the information that is available. Each of the methods requires training before it can be used efficiently to detect the wanted patterns correctly. Training means that the classification system is given a set of input data together with the corresponding outputs that the system should produce.

Using this data the classification system learns how to produce the correct output from the given input. Therefore preliminary work should be done to recognize the proper classes into which the gathered data is classified and to train the systems.
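The idea of training described above can be illustrated with a minimal sketch: a set of hypothetical, labeled energy-consumption feature vectors serves as the training data, and a very simple nearest-example rule then assigns a new input to one of the classes. The features and class names below are invented for illustration only.

```python
# A minimal sketch of supervised training data: each input (here, a
# hypothetical pair of energy-consumption features) is paired with the
# class the system should output for it.
training_data = [
    ((0.2, 0.1), "low"),      # features -> expected class
    ((0.3, 0.2), "low"),
    ((0.8, 0.9), "high"),
    ((0.9, 0.7), "high"),
]

def classify(x):
    """Assign x to the class of its nearest training example
    (squared Euclidean distance)."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    _, label = min(training_data, key=lambda pair: dist(pair[0], x))
    return label

print(classify((0.25, 0.15)))  # -> low
print(classify((0.85, 0.80)))  # -> high
```

A real system would use one of the methods listed above instead of this nearest-example rule, but the shape of the training data, inputs paired with expected outputs, is the same.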

For this work the most interesting methods are NN and PDF. The reasons for selecting these methods are the vast possibilities that NN provides and the simplicity of PDF. With NN many kinds of input data can be classified with high accuracy if the structure of the network is correct, and PDF provides a straightforward approach even for complex tasks.

Therefore, a closer look at these methods should be taken.

3.2.1 Neural Network

A neural network (NN), or artificial neural network (ANN), is created to imitate the functionality of the human brain. One of the best-known ANN models is the multi-layer feed-forward neural network (MFNN), which consists of an input layer, an output layer and hidden layers between them. The knowledge of the MFNN is stored in the weights between the layers, and through training the weights are set to the proper values for solving a certain problem. [9]
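The layered structure described above can be sketched as follows. The weight values here are illustrative only, not trained; in a real MFNN they would be set by a training algorithm such as backpropagation, and the layer sizes would depend on the problem.

```python
import math

def sigmoid(x):
    """Common activation function squashing values into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected layer: each unit takes a weighted sum of the
    inputs plus a bias and passes it through the activation."""
    return [sigmoid(sum(w * i for w, i in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# A 2-input, 3-hidden-unit, 1-output MFNN; the "knowledge" of the
# network lives entirely in these weight and bias values.
W_hidden = [[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]]
b_hidden = [0.0, -0.1, 0.2]
W_out = [[1.2, -0.7, 0.4]]
b_out = [0.1]

def forward(x):
    """Feed an input through the hidden layer and then the output layer."""
    return layer(layer(x, W_hidden, b_hidden), W_out, b_out)

print(forward([1.0, 0.5]))  # a single output value in (0, 1)
```

Training would repeatedly compare `forward(input)` against the expected output and adjust the weights to reduce the error.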

Qin, Ewing and Thompson compared in their work [9] the ANN and the autoregressive integrated moving average (ARIMA) model that is often used with time series data. In the work they made a short-period forecast of the wind speed using an ARIMA model and a recurrent neural network (RNN), shown in Figure 5. They wanted to find out whether the RNN could be used efficiently to forecast time series data. In the work they forecasted the wind speed only 15 minutes ahead using samples from five different heights and 30 periods. As a result they found out that the RNN gives better results than the more often used ARIMA model.

Figure 5. A recurrent neural network [9].

Cubiles-de-la-Vega, Pino-Mejías, Pascual-Acosta and Muñoz-García described in their work [10] a method to design a multilayer perceptron (MLP) using information from an ARIMA model. They used the MLP to forecast time series with data sets of the population of Andalusia and the hotel occupancy. In both cases the results of the MLP were better than the ones achieved using the traditional ARIMA model.
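Before an MLP can forecast a time series, the series must be turned into supervised training pairs. One common way, sketched below with invented numbers, is a sliding window: each input is a vector of past values (the lags an ARIMA analysis might suggest) and the target is the next value. This is a general illustration, not the exact procedure of [10].

```python
def make_windows(series, n_lags):
    """Turn a time series into (input window, next value) training
    pairs for a feed-forward network such as an MLP."""
    pairs = []
    for t in range(n_lags, len(series)):
        pairs.append((series[t - n_lags:t], series[t]))
    return pairs

series = [10, 12, 13, 12, 15, 16]   # hypothetical periodic observations
print(make_windows(series, 3))
# -> [([10, 12, 13], 12), ([12, 13, 12], 15), ([13, 12, 15], 16)]
```

The choice of `n_lags` (the window length) is exactly the kind of structural information an ARIMA model can provide for the network design.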

3.2.2 Probability density function

The probability density function (PDF) shows the likelihood that a variable takes a certain value, for example the likelihood that the input variable of the system is 1. PDF can be used for forecasting [11, 12, 13] and for fault detection [14]. The method seems quite promising when thinking about how to realize the system of this thesis. For forecasting, the PDF has been used for wind power [13] and for power distribution systems [11, 12].


Heydt, Khotanzad and Farahbaksshian proposed already in 1981 a method [11] for using a PDF to forecast a power system load. They used the Gram-Charlier series type A for calculating the PDF from forecasted moments. For the forecasting itself they used a time series approach or a least-squares method (LSM). The forecast methods were tested with a short-term forecast (1 h) and over a longer period (20 h). With shorter time periods (<2 h) the time series approach seemed to be more precise, but over longer periods the results were better using the LSM. It is a bit strange to find an over 30-year-old research that handles the very question that needs to be solved in the current problems. One might have thought that it would have been in wider and more public use, but it does not seem so. Still, the results they got were quite promising, as the error of the mean consumption was less than 3% half of the time.
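The least-squares idea mentioned above can be sketched in its simplest form: fit a straight line to past observations, for example of a statistical moment such as the mean load, and extrapolate it one step ahead. The data values are invented, and this linear sketch is far simpler than the method of [11].

```python
def least_squares_forecast(y):
    """Fit a line y = slope * t + intercept to observations y[0..n-1]
    by ordinary least squares and predict the value at t = n."""
    n = len(y)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(y) / n
    slope = (sum((x - x_mean) * (yi - y_mean) for x, yi in zip(xs, y))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return slope * n + intercept

# Hypothetical hourly mean-load observations:
print(least_squares_forecast([10.0, 11.0, 12.5, 13.0]))  # -> 14.25
```

In [11] the forecasted quantities were the moments of the distribution, from which the full PDF was then reconstructed with the Gram-Charlier series.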

In 1999 Charytoniuk, Chen, Kotas and Olinda did a research [12] similar to [11]. They used the information of consumer consumption and the weather temperature as parameters. They also used a nonparametric approach, where the data is not assumed to belong to any particular distribution, because that way there was no need for a statistical analysis of the data. In the research the relative root mean square error (RRMSE) was less than 0.2 when forecasting the demand of 20 households over one month. When the same test was done with commercial customers, factories and plants, the error was closer to 0.1 even with only 5 customers. In the work it was also said that the accuracy of the method depends on the way the customers are classified and on the amount of the customers. [12] The energy consumption of the industry is more static and easier to predict than the consumption of households.
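The RRMSE figures quoted above can be made concrete with a short sketch. One common definition, used here, normalises the root mean square error by the mean of the actual values; the exact normalisation in [12] may differ, and the demand numbers are invented.

```python
import math

def rrmse(actual, forecast):
    """Root mean square error of the forecasts, divided by the mean of
    the actual values (one common form of the relative RMSE)."""
    mse = sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    return math.sqrt(mse) / (sum(actual) / len(actual))

actual   = [100.0, 120.0, 110.0, 130.0]   # hypothetical hourly demand
forecast = [ 95.0, 125.0, 115.0, 120.0]
print(round(rrmse(actual, forecast), 3))  # -> 0.058
```

An RRMSE of 0.2, as reported for the households, thus corresponds to a typical forecast error of about 20% of the mean demand.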

The previous paragraphs have shown that the PDF can be used to forecast the whole energy consumption, but it can be used in a similar way on a smaller scale also. Taylor, McSharry and Buizza presented a method of using the PDF for forecasting the wind power [13]. When thinking about the project this thesis is a part of, this is quite interesting, because the smart energy grid is not only the main power lines that come to the home, but also the smaller systems that can be connected to the network. This includes the wind power farms and also the individual windmills people can build for their own use. So the PDF could be used to determine when the windmill produces enough power to be useful. In the work they used data from five wind farms in the United Kingdom and the weather forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF). The wind speed was measured from the height of 10 m, which seems a rather low altitude. That altitude is below the tree line, and usually, at least in Finland, the windmills are much higher to capture the optimal winds. As the data they used the daily wind speed from the past 8 years. When the methods were tested, the results were at least interesting. When forecasting beyond 2 days, the 5-year mean of the windmill power production gave as precise an answer as any other method used. So if we are not interested in the production of tomorrow, but want to see further, the mean of the earlier years can provide us all the information we need. In a way this is not a surprising result. The weather is a variable that is extremely difficult to forecast, but if we look at it over a

under uncertainty [14] and some of their modified versions, which include more complex approaches like the fuzzy and the neural networks. [15, 16] Besides the classical decision making theories, a look at the rule engines is also taken. The expert systems are not discussed in this chapter because they include the same decision making methods as the rule engines, but they are considered later in Chapter 6. [17, 18]

3.3.1 Decision making theory

How to make a decision between the possible outcomes? How to take the uncertainty of life into consideration and still find the best alternative from the given choices?

Kozine [14] gives us a simple overview of the non-conventional approaches to decision