
PART III: DEVELOPMENT AND EVALUATION OF MODEL IDENTIFICATION

9.   PARAMETER IDENTIFICATION


9.4.2 Model Predictions

To evaluate the entire model identification scheme, the model predictions need to be analysed, as discussed in Section 9.3. Such analysis is particularly important for real networked systems whose true topology and true model parameters are unknown. Here we consider model predictions using synthetic data, treated like real data, so as to obtain a reference to which model predictions obtained with real network data can later be compared. For six interaction parameter values, Figure 9.4 shows prediction results for state 1 for each node with the estimated Ising models. The figure also shows linear regression lines fitted to the node state predictions and reference lines indicating optimal predictions.

Predictions are poor with a small interaction parameter, as the neighbouring nodes hardly interact and the conditional state probabilities are mostly determined by the random external loading. With a larger interaction parameter, the neighbours contribute more and the neighbourhoods are estimated more accurately, which improves the predictions.
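To make this mechanism concrete, the following is a minimal sketch of the conditional state probability in a ±1-state Ising model with external fields h and pairwise couplings J (a standard parameterisation assumed here for illustration; the exact notation used in this work may differ). With couplings near zero the neighbour term vanishes, so the probability is driven almost entirely by the external field, which is why neighbourhood-based predictions fail at small interaction parameter values.

```python
import numpy as np

def p_state1(i, s, J, h):
    """Conditional probability that node i is in state +1 given the states
    s of the other nodes, for a +/-1 Ising model with couplings J and
    external fields h (illustrative stand-ins, not this work's notation):

        P(s_i = +1 | rest) = 1 / (1 + exp(-2 * (h_i + sum_j J_ij * s_j)))
    """
    local_field = h[i] + J[i] @ s - J[i, i] * s[i]  # exclude any self-term
    return 1.0 / (1.0 + np.exp(-2.0 * local_field))
```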

Absolute prediction errors and slope coefficients of the fitted linear regression lines are shown in Figure 9.5 as functions of the interaction parameter for all 21 model parameterisations, using both estimated and true Ising models.

Figure 9.4. Predictions with estimated (structure and parameters) Ising models. Node state 1 probability predictions are shown for each node as a function of data-calculated node state 1 probabilities. Predictions are shown as dots, linear regression lines fitted to the predictions as solid lines, and reference curves of optimal predictions as dashed lines. Predictions from top-left to bottom-right correspond to the following six interaction parameter values: 0, 0.04, 0.08, 0.12, 0.16, and 0.20. Calculation: of the three ensembles, shown are those corresponding to the minimum average node state 1 prediction errors.

For the true Ising model, the error is largest with a small interaction parameter and then steadily decreases as the parameter increases. For the estimated Ising model, however, the error grows as the parameter increases, assuming its largest values at about 0.12. Consequently, the error alone seems a poor measure of the goodness of the predictions, because the predictions are clearly poor at some of the smallest parameter values, where node-specific behaviour cannot be predicted.

With both the estimated and the true graph structure, the slopes assume their smallest values at small interaction parameter values and then quickly rise close to one. However, with the estimated model, the rise happens at larger parameter values than with the true model. The slope coefficients have the problem that they ignore the fluctuations taking place around the regression lines. Therefore, both errors and slopes must be studied to judge the goodness of the predictions. According to the errors and slopes, the best predictions are achieved with the largest parameter values, even though the best topology and parameter estimates are obtained at around 0.12. In Figure 9.4, again, deviations in average states between nodes are largest at medium coherence values, i.e., at around 0.12, which probably complicates prediction and explains at least the large prediction errors.
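As a sketch of how the two goodness measures could be computed from per-node predicted and data-calculated state 1 probabilities (array names here are illustrative, not from this work):

```python
import numpy as np

def prediction_quality(p_pred, p_data):
    """Average absolute node state 1 prediction error and the slope of a
    least-squares line fitted to (data-calculated, predicted) probability
    pairs; a small error together with a slope near one indicates good
    node-specific predictions."""
    abs_error = np.mean(np.abs(p_pred - p_data))
    slope, _intercept = np.polyfit(p_data, p_pred, deg=1)
    return abs_error, slope
```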

9.4.3 Effect of Data Characteristics

This subsection studies the effect of data characteristics on model parameter identification with the same cases with which the effect of data characteristics on topology identification was tested in Subsection 8.5.4, exploiting the respective graph estimates as Ising model structures. Thus, not only parameter identification but, in fact, the whole model identification scheme is considered here.

Again, the type of the node load distribution is first changed from the reference case's Uni(0, 1) distribution to an N(0.5, 0.25) distribution and then to an Exp(0.58) distribution.
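For illustration, the three load distributions could be sampled as below. Whether 0.25 denotes the standard deviation or the variance of the normal distribution, and whether 0.58 is the mean or the rate of the exponential, is not specified above, so the choices in this sketch are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

uni_loads = rng.uniform(0.0, 1.0, n)   # Uni(0, 1), the reference case
norm_loads = rng.normal(0.5, 0.25, n)  # N(0.5, 0.25); 0.25 assumed to be the std. dev.
exp_loads = rng.exponential(0.58, n)   # Exp(0.58); 0.58 assumed to be the mean (scale)
```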

Model parameter estimates corresponding to the topology identification results in Figure 8.8 are shown in Figure 9.6. Back in Subsection 8.5.4, the graph estimates were nearly equally good in all three cases. However, according to the ASSMI, coherence in the exponential distribution case is greater at larger interaction parameter values.

Figure 9.5. Node state 1 absolute prediction errors (left) and slopes of the fitted linear regression lines (right) as functions of the interaction parameter. Results are shown for the true (squares) and estimated (circles) Ising models. Calculation: absolute prediction errors are medians over three ensembles, where each error is calculated as an average over all nodes. Slope coefficients are medians of the respective coefficients obtained with each ensemble.

This probably explains the better parameter estimates obtained in the same case here with large parameter values.

As Figure 9.6 shows, when the average node neighbourhood size is altered from the reference case's 8.8 to 6.8 and to 10.8, the model parameter estimates seem best at 6.8, where the estimation measures are quite good even at the largest interaction parameter values and their fluctuations are relatively small.

In Subsection 8.5.4, we saw that the graph estimates, too, were better with a small neighbourhood size at large interaction parameter values.

At 10.8, the parameter estimates are the worst, even though with small interaction parameter values they are as good as in the two other cases. Apparently, at larger parameter values, poor graph structure estimates cause poor parameter estimates.
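If the test networks were random graphs, one way to realise the three average neighbourhood sizes would be to set the Erdős–Rényi edge probability from the target degree, as in the sketch below (an assumption for illustration; the network generation procedure of this work is described elsewhere):

```python
import networkx as nx

def graph_with_avg_degree(n_nodes, avg_degree, seed=None):
    """Erdos-Renyi graph G(n, p) whose expected average degree equals
    avg_degree: the expected degree is p * (n - 1), so p = avg_degree / (n - 1)."""
    return nx.gnp_random_graph(n_nodes, avg_degree / (n_nodes - 1), seed=seed)

# The three neighbourhood-size cases above, on 30-node networks:
graphs = [graph_with_avg_degree(30, k, seed=1) for k in (6.8, 8.8, 10.8)]
```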

In Subsection 8.5.4, with the two largest data set sizes, 540 and 1080, the topology estimates were slightly better than with the reference case's 270. In addition, the two largest data sets differed only slightly, in favour of the larger one. Their differences in parameter estimates are small, as shown in Figure 9.6. In the reference case, the parameter estimates are only slightly worse than in the two other cases.

Figure 9.6. Effect of load distribution type (top row), node neighbourhood size (middle row), and data set size (bottom row) on parameter identification. The three estimation measures (left, centre, and right) are shown as functions of the interaction parameter. Top row: exponential (squares), uniform (circles), and normal (triangles) node load distributions. Middle row: average neighbourhood sizes 6.8 (squares), 8.8 (circles), and 10.8 (triangles). Bottom row: data set sizes 270 (circles), 540 (squares), and 1080 (triangles). Calculation: the measures are medians over the respective values with three ensembles.

The differences occur mostly at some of the largest interaction parameter values. Between the data set sizes 540 and 1080, the parameter estimates are similar.

The effect of network size on topology estimation was tested in Subsection 8.5.4 with three different schemes. First, a data set size of 270 was used for each network size, 30, 60, and 120 nodes, and with the two larger networks the graph estimates appeared quite poor. The model parameter estimates corresponding to this case are presented in Figure 9.7. As expected, the 30-node network yields the best parameter estimates, though the differences are small, in particular between the 60- and 120-node networks. When the data set size was increased linearly in the network size in Subsection 8.5.4, the topology estimates were almost equally good with all three network sizes. As Figure 9.7 shows, the differences in the corresponding parameter estimates are now also smaller, even if only slightly, than with a constant data set size. Finally, the data set size was increased quadratically in the network size, but this was tested only with the 60-node network in Subsection 8.5.4, where the worst results were clearly those with a data set size of 270, whereas the two larger sizes showed slight differences. However, since the parameter estimates here are similar with all three data set sizes, the differences shown in the graph estimates do not appear in the parameter estimates.

Figure 9.7. Effect of network size on parameter identification. The three estimation measures (left, centre, and right) are shown as functions of the interaction parameter. Top row: network sizes 30 (circles), 60 (squares), and 120 (triangles), each with a data set size of 270. Middle row: network size and data set size pairs (30, 270) (circles), (60, 540) (squares), and (120, 1080) (triangles). Bottom row: data set sizes 270 (circles), 540 (squares), and 1080 (triangles), each with a network size of 60. Calculation: for the 30-node network, the measures are medians over the respective values with three ensembles. For the other network sizes, the measures are calculated from single ensembles.

Yet, the results confirm our previous conclusion from Subsection 8.5.4 that the data set size must be increased at least linearly in the network size for the data to be representative.
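The three sampling schemes can be written out explicitly from the sizes quoted above: a constant data set size of 270; linear scaling, 270 · (n/30), giving the pairs (30, 270), (60, 540), and (120, 1080); and quadratic scaling, 270 · (n/30)², giving 1080 for the 60-node network. A minimal sketch:

```python
def data_set_size(n_nodes, scheme, t_ref=270, n_ref=30):
    """Data set size under the three scaling schemes discussed above,
    relative to the 30-node reference case with data set size 270."""
    ratio = n_nodes / n_ref
    factor = {"constant": 1.0, "linear": ratio, "quadratic": ratio ** 2}[scheme]
    return int(t_ref * factor)

assert data_set_size(60, "linear") == 540
assert data_set_size(120, "linear") == 1080
assert data_set_size(60, "quadratic") == 1080
```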

Finally, in Subsection 8.5.4, the quality of the data used in the analyses was tested with a network of 30 nodes by studying its topology identification with three MCMC burn-in periods of 250 × 30, 500 × 30, and 1000 × 30 steps. The topology estimates turned out similar in all three cases. Figure 9.8 shows that the same holds for the model parameter estimates, which are similar for all burn-in periods, thus confirming the quality of the reference data set.
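A Gibbs sampler with a configurable burn-in, as sketched below, would reproduce this kind of test. The conditional probability is the same ±1 Ising form assumed earlier; the specific sampler used in this work is not reproduced here.

```python
import numpy as np

def gibbs_sample_ising(J, h, burn_in_steps, n_samples, rng=None):
    """Sample node-state vectors from a +/-1 Ising model by single-site
    Gibbs updates, discarding the first burn_in_steps updates (e.g.
    250*30, 500*30, or 1000*30 for a 30-node network, as above) and
    then recording one sample per full sweep of n single-site updates."""
    rng = rng or np.random.default_rng()
    n = len(h)
    s = rng.choice([-1, 1], size=n)
    samples = []
    for step in range(burn_in_steps + n_samples * n):
        i = rng.integers(n)
        local_field = h[i] + J[i] @ s - J[i, i] * s[i]
        p_up = 1.0 / (1.0 + np.exp(-2.0 * local_field))
        s[i] = 1 if rng.random() < p_up else -1
        if step >= burn_in_steps and (step - burn_in_steps) % n == n - 1:
            samples.append(s.copy())
    return np.array(samples)
```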

Figure 9.8. Effect of MCMC burn-in steps on parameter identification. The three estimation measures (left, centre, and right) are shown as functions of the interaction parameter. Results are shown for the following numbers of MCMC burn-in steps: 250 × 30 (squares), 500 × 30 (circles), and 1000 × 30 (triangles). Calculation: the measures are medians over the respective values with three ensembles.