
Neural N-Tree Algorithm

Visualization of hyperdimensional data in cluster analysis is not a trivial task, and neither is visualization of the algorithm-generated model [120]. Approaches to visualizing hyperdimensional data and the cluster boundaries can be found in the literature [120]. For instance, sub-space clustering reduces the dimensions to k; hence, the data within the sub-spaces and the separating cluster boundaries can be efficiently visualized to the end users for small k. However, the lower bound of sub-space clustering is Ω(2^k) for both the required time and space, because an exponential number of sub-space combinations exists [121]. Exact sub-space clustering is therefore intractable for large k on large datasets. Hierarchical clustering can be used to generate dendrograms that visualize the relationships between the clusters in the form of a tree; however, dendrograms usually cannot be adjusted or changed after their creation. Dimensionality reduction methods such as principal component analysis (PCA) [122] can be used to reduce the number of dimensions before clustering. Furthermore, self-organizing maps (SOMs) [123] (Figure 4.11) have been successfully used in both dimensionality reduction tasks and cluster analysis if a U-matrix, which represents the SOM as a grayscale image, is used to visualize the resulting SOM.

Figure 4.11: Self-organizing map (retrieved from https://towardsdatascience.com/self-organizing-maps-ff5853a118d4)

We decided to adapt the interpretable nature of decision trees and self-organizing maps to form a clustering algorithm that could be visualized in k = 2 dimensions, to address the difficulty of visualizing hyperdimensional data and the model of the cluster analysis. Furthermore, the resulting model should be both intuitive to explain and adjustable. This means that the end user should be able to modify and adjust the model generated by the algorithm, hence satisfying the requirements of the Augmented Intelligence (AUI) method.

The developed neural-network type of clustering algorithm, named Neural N-Tree (NNT), has the internal structure of a balanced binary tree. Each node in an NNT contains a codebook vector in hyperdimensional space, and each terminal node represents a cluster and contains a codebook vector. The data points of the to-be-clustered dataset are traversed from the root node to one of the terminal nodes of the NNT by comparing the child nodes of the current node n and selecting the child node

ArgMin(d(q, c_left), d(q, c_right)) if d is a distance function F, and

ArgMax(d(q, c_left), d(q, c_right)) if d is a similarity function F,

for the feature vector q and the codebook vectors c_left and c_right.
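
To make the traversal concrete, the following Python sketch routes a feature vector from the root to a terminal node. The Node class, the euclidean helper, and the use of Euclidean distance as F are illustrative assumptions, not definitions taken from PIV.

    import numpy as np

    class Node:
        """A node of the NNT: a codebook vector plus optional children."""
        def __init__(self, codebook, left=None, right=None, index=None):
            self.codebook = codebook  # codebook vector in hyperdimensional space
            self.left = left
            self.right = right
            self.index = index        # cluster number for terminal nodes

    def euclidean(q, c):
        return np.linalg.norm(q - c)

    def traverse(root, q, d=euclidean, is_distance=True):
        """Route the feature vector q from the root to a terminal node,
        choosing at each step the child given by the ArgMin/ArgMax rule."""
        node = root
        while node.left is not None and node.right is not None:
            d_left = d(q, node.left.codebook)
            d_right = d(q, node.right.codebook)
            if is_distance:   # ArgMin for a distance function
                node = node.left if d_left <= d_right else node.right
            else:             # ArgMax for a similarity function
                node = node.left if d_left >= d_right else node.right
        return node  # the terminal node is the cluster of q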

Before the actual clustering in the NNT algorithm, the NNT model is built by creating a balanced binary tree for k clusters and initializing the codebook vectors of each node with random values. After the creation of the NNT model, each node is indexed with a post-order traversal so that the terminals with the same parent are indexed with n_left = k, n_right = k + 1. The index of a terminal node represents the number of the cluster. Afterwards, the codebook vectors within the nodes are adjusted.

Figure 4.12: Model of NNT

Figure 4.13: Dimensionality reduction of NNT
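
Continuing the sketch above, a hypothetical build step could look as follows; the splitting scheme and the random initialization range are our assumptions.

    def build_nnt(k, dim, rng=np.random.default_rng(0)):
        """Build a balanced binary NNT with k terminal nodes (clusters),
        each node holding a random codebook vector."""
        def build(n_terminals):
            node = Node(rng.random(dim))
            if n_terminals > 1:
                n_left = n_terminals // 2
                node.left = build(n_left)
                node.right = build(n_terminals - n_left)
            return node

        root = build(k)

        # Post-order traversal numbers the terminals 0 .. k - 1, so
        # sibling terminals receive consecutive cluster indices.
        counter = 0
        def index_terminals(node):
            nonlocal counter
            if node.left is None and node.right is None:
                node.index = counter
                counter += 1
            else:
                index_terminals(node.left)
                index_terminals(node.right)
        index_terminals(root)
        return root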

Let M be the NNT model and let N be the set of nodes, where ∀ N_k ∈ M. Starting from the root n_1 ∈ N, the level order is calculated so that each level in the list contains all the nodes of that level. Then, for each level, the codebook vector of each node is adjusted with

c_k(s + 1) ← c_k(s) + f(γ) α (D(t) − c_k(s)),

where f(γ) is the distance function F from the root to the terminal and α is the learning rate. After the whole NNT model has been updated, the child that is more similar to the feature vector is chosen, and the update is performed again with the most similar child of the original root acting as the new root. The distance function F, f(γ), is used because the nodes positioned far from the root within the NNT model should be updated the most, while the nodes positioned closer to the root n_1 receive a weaker update. Algorithm 1 shows the training process of the model in the NNT algorithm. A detailed description of NNT is given in PIV.
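
Under these definitions, one training step could be sketched as follows. The concrete form of f(γ), implemented here as a simple depth-proportional factor, and the learning rate are illustrative assumptions; Algorithm 1 and PIV remain the authoritative description.

    def level_order(root):
        """Group the nodes of a (sub)tree level by level."""
        levels, frontier = [], [root]
        while frontier:
            levels.append(frontier)
            frontier = [c for n in frontier
                        for c in (n.left, n.right) if c is not None]
        return levels

    def train_step(root, x, alpha=0.1, f=lambda depth: depth + 1):
        """One NNT update for a single data vector x: update the whole
        (sub)tree level by level, scaled by f so that nodes far from the
        current root move the most, then recurse into the more similar
        child until a terminal node is reached."""
        node = root
        while node is not None:
            for depth, level in enumerate(level_order(node)):
                for n in level:
                    n.codebook += f(depth) * alpha * (x - n.codebook)
            if node.left is None or node.right is None:
                break
            node = (node.left
                    if euclidean(x, node.left.codebook)
                       <= euclidean(x, node.right.codebook)
                    else node.right)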

The trained model of the NNT algorithm can be visualized as if it were a binary tree in two-dimensional Euclidean space (Figures 4.12 and 4.13). The nodes of the NNT can be colored for the visualization based on the codebook vector values within the nodes, which yields a u-matrix-like presentation of the NNT in which the nodes with the most similar codebook vector values are colored with the same color. Because of the NNT training process, the level-ordered sub-trees of a node's left child are usually colored with more similar colors than the level-ordered sub-trees of the right child of the same node. When the coloring is applied to all nodes, the resulting visualization of the NNT can usually act as a dimensionality reduction method without a loss of generalizability. The colored NNT visualizes the relationships between the different parts of the dataset.
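
As a final illustration, one simple way to derive such a coloring is to project each codebook vector to a scalar shade; the mean-value projection below is only one possible choice and is not prescribed by PIV.

    def color_nodes(root):
        """Map each node's codebook vector to a grayscale shade in [0, 1],
        so that nodes with similar codebook values get similar shades
        (a u-matrix-like coloring of the tree)."""
        nodes = [n for level in level_order(root) for n in level]
        values = np.array([n.codebook.mean() for n in nodes])
        lo, hi = values.min(), values.max()
        shades = (values - lo) / (hi - lo + 1e-12)
        return {id(n): s for n, s in zip(nodes, shades)}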