Poset-derived Forest Data Structure - Poset-derived Forests

3.3 Poset-derived Forests

3.3.1 Poset-derived Forest Data Structure

The poset-derived forest data structure is used to store filters by their covering property with other filters and is similar to the Siena filters poset presented in the previous section with the exception that each node has only one parent (Figure 3.4). The forest is a generic data structure and

may be extended with the setssubscribers(f),forwards(f),advertisers(a), and forwards(a).

type=A ∧rng[0,100]

type=A ∧rng[0,40] type=A ∧rng[20,90]

type=A ∧rng[30,35] type=A ∧rng[35,38]

type=A ∧rng[0,100]

type=A ∧rng[0,40] type=A ∧rng[20,90]

type=A ∧rng[30,35] type=A ∧rng[35,38]

Filters Poset Poset-derived Forest

type=A ∧equ 35

∧name=B

type=A ∧equ 35

∧name=B

Figure 3.4: Filters poset and poset-derived forest for the same set of filters.

A pair (F,) represents the poset-derived forest, where F is a finite set of filters and is a subset of the covering relation. More formally:

Definition 3.5 A pair (F,) is a poset-derived forest with base setF, if 1. F is a finite set of filters and is a relation between filters in F.

2. For each a ∈ F there is at most one b ∈ F for which b a, i.e., (F,) is a forest with the relation going from parent to child.

3. If a, b∈ F and ba, then bwa.

It is convenient for uniformity of treatment to imagine the roots of the trees belonging to (F,) to be children of a node not in F, which we will call theimaginary root of (F,).

(F,) is called maximal in F if there do not exist a, b∈ F for which (F, ∪{(a, b)}) is a poset-derived forest. It is clear that any poset-derived forest can be extended to a maximal one by adding pairs to the relation . While there may exist several maximal forests for the same base set, Theorem 3.7 shows that it is still meaningful to speak of a “root set”.

The forest can be used to easily compute the minimal cover for F (Corol-lary 3.8). Theorem 3.9 is useful when using the poset-derived forest to detect and compare the overlap of filters. The theorems of Section 3.2.3 are also applicable to forests.

3.3 Poset-derived Forests 35 In applications we typically require the maximality criterion to hold.

The maximality criterion may be generalized to apply at any level of the forest with the following definition.

Definition 3.6 A poset-derived forest (F,) is sibling-pure at node a (a may be the imaginary root) if there do not exist b, c ∈ F for which ab, ac, and either cwb or bwc. The forest is pure if it is sibling-pure at every node, including the imaginary root.

Sibling-purity at a node means that the node’s children in the forest are incomparable with each other. When this holds for every node, the forest is sibling-pure. In other words, a sibling-pure forest ensures that nodes are locally placed as far away from the root nodes as possible. In subsequent examination we assume that the forest is sibling-pure.

Theorem 3.7 The set of roots of the trees in a poset-derived forest maxi-mal inF depends only onF and not on the forest relation. Furthermore, this set is a subset of the set of roots of any poset-derived forest with base setF.

Proof. Let (F,) be a maximal poset-derived forest, and let (F,⁰) be another poset-derived forest having a root a that is not a root in (F,).

Then there exists ab ∈ F for which bw a. These two observations mean that adding (b, a) to ⁰ does not break any of the properties of Defini-tion 3.5, so (F,⁰) is not maximal in F.

For the second part, let a ∈ F be a root in a maximal poset-derived forest. As above, since the forest is maximal, there cannot exist ab∈ F for which bwa. Hence amust be a root of a tree in any poset-derived forest with base setF. 2

A cover for a set of filtersF is defined to be a setG ⊆ F such that for each f ∈ F there exists a g ∈ G for which g wf. This cover is minimal if it does not contain a proper subset that is also a cover ofF. It is clear thatwcannot hold between two members of a minimal cover.

Corollary 3.8 For any set of filtersF there exists a unique minimal cover.

This minimal cover is the set of root nodes of any maximal poset-derived forest with base set F.

Proof. Any root set of a poset-derived forest with base setF is a cover of F and conversely, for each cover it is possible to construct a poset-derived forest with that cover as its root set. The claim now follows directly from Theorem 3.7. 2

For two sets of filters F,G theoverlap of F with G is defined to be the subset of those elements in G which overlap with some element ofF.

Theorem 3.9 For any sets of filters F,G the overlap of F with G is the same as the overlap of the minimal cover of F with G.

Proof. Obviously the overlap of the minimal cover is contained in the overlap of the full set. Now let g ∈ G belong to the overlap of F with G.

Then there exists anf ∈ F for whichf 'g, orN(f)∩N(g)6=∅. Now there exists an f⁰ in the minimal cover ofF for which f⁰ wf, orN(f)⊆N(f⁰).

From this it follows thatN(f)∩N(g)⊆N(f⁰)∩N(g), and thereforef⁰ 'g and g belongs to the overlap of the minimal cover ofF withG. 2

The two central operations for the poset-derived forest are the addition of new elements to the forest, and deleting existing items from it. The operations are presented in Algorithm 3.

Sibling-purity is very easy to maintain for the addoperation, but more complicated for the deloperation. It is expected that for some application areas, such as hierarchical routing or the management of filters from local clients, it is not necessary to maintain sibling-purity for thedel operation.

This simplifies the deloperation in Algorithm 3.

Algorithm 3 assumes that there is an efficient way to find if a filter has already been placed into the structure. This is possible usingsyntactic equivalence using hashtables. In syntactic equivalence, canonical represen-tations of filters are compared. Syntactic equivalence is not necessarily implied by semantic equivalence, e.g.,F1 wF2∧F2 wF1. Semantic equiv-alence is computationally more complex to determine, whereas syntactic equivalence may be achieved in constant or near-constant time and it de-tects all semantically identical filters with simple filtering languages. We note that this restriction to syntactic equivalence does not break the data structure or the routing algorithms. Filters that fail the equivalence testing will be simply placed into the structure.

In document Efficient Content-based Routing, Mobility-aware Topologies, and Temporal Subspace Matching (sivua 43-46)