• Ei tuloksia

In this chapter we presented a discrete model for publish/subscribe mobility support. We examined the cost of pub/sub mobility using three mobility mechanisms and topologies: generic mobility support, acyclic graphs, and rendezvous-based topologies. We also discussed the impact of complete-ness and incompletecomplete-ness of the pub/sub topology on the cost of mobility.

We identified two important optimizations, overlay-based routing and ren-dezvous points. The generic mechanism has a high cost for mobility. The other two mobility mechanisms have a considerably smaller cost.

Mobility-safety cannot be guaranteed if protocols engineered with the completeness assumption are used for incomplete topologies. Based on both the theoretical model and the simulation results, we propose three techniques for improving mobility-aware pub/sub systems: overlay-based routing, rendezvous points, and completeness checking.

We presented a pub/sub mobility simulator and experimental results with the protocols and compared the theoretical upper bound costs with average costs from a simulation scenario. Based on the simulation exper-iment, rendezvous points offer significant performance benefits for mobil-ity protocols. The acyclic protocols have lower cost and latency than the generic API ping/pong protocol. The lowest cost and latency were achieved using the complete acyclic protocol with a rendezvous point.

Part IV

Advanced Data Structures and Techniques

107

Chapter 6

DoubleForest for Temporal Subspace Matching

In this chapter we show how the forest data structure may be used to create more advanced structures for matching and comparing various pro-files, such as interest or context profiles. Profiles are defined in a multi-dimensional content space using filters and they support both discrete and interval values. We present the DoubleForest and a graphical browser tool for visualization. The main application areas of the DoubleForest struc-ture are profile and context-based matching and temporal content-based routing. We compare the results to a set-based algorithm.

6.1 Overview

We observed previously that most event and context processing systems do not feature optimized data structures for routing events and support-ing dynamic profile-based queries. However, there are many optimized matchers for static queries. Support for rapid updates is required, because context descriptions and interests may change rapidly when circumstances change. In this chapter we address two central limitations of current event systems: temporal notifications and subspace queries. Temporal notifica-tions (or profiles) have a duration instead of being instantaneous. The subspace-matching model follows the previously presented content-based routing model with the difference that notifications are defined as sub-spaces of the content space instead of as points. We use the term profile instead of notification to highlight the difference of the proposed model compared to the pub/sub model.

We represent both profiles and profile queries using generic filters. Our 109

notion of a filter is derived from event-based systems. The motivation for this is that by leveraging the properties of filters, namelycoveringand over-lapping between them, we can optimize the data structures in a generic, filter-language-independent fashion, and also support more expressive sub-set selection and matching operations. Moreover, there are also techniques for performingfilter merging to remove redundancy from filter sets by mod-ifying them.

We propose that the poset-derived forest is used to store both profiles and queries based on the covering relation. The forest supports frequent updates to the data structure and it has approximately the same matching performance as the filters poset.

DoubleForest is a new data structure for managing profiles and queries based on covering relations. The assumption is that there are covering relations in both sets. This assumption is realistic for many scenarios, for example, covering is useful for geographic queries and profiles. Filter covering may be determined efficiently for simple predicate-based filters [31]

and attribute filters with disjunctions [132]. Algorithms exist for arbitrary conjunctive filters [74], and also conjunctive tree queries [35].

We support two different matching operators for subspace matching, namely covering and overlapping. The proposed data structure combines these features and supports what we calltemporal subspace matching. Match-ing semantics, such asexact,subsume,intersection, anddisjoint have been previously proposed in the context of matchmaking with a Description Logic reasoner [80]. However, we are not aware of any optimized data structures for matching with these operators.

DoubleForest consists of two poset-derived forest data structures that have associated mappings from the elements of one structure to the other.

We have demonstrated that the forest structure supports frequent data element insertions and removals. We present a new optimization technique for the DoubleForest, which involves the determination of upper and lower bounds. The bounds are used to inspect only the boundary — a set that contains the candidate nodes.

We define the profile-matching problem as follows given a set of profiles and a set of queries:

• for a new profile, find the set of associated (matching) queries,

• for a new query, find the set of associated (matching) profiles,

• when removing an existing profile, find the set of associated queries,

• when removing an existing query, find the set of associated profiles.

6.1 Overview 111 Figure 6.1 illustrates how two forests, based on filter setsP and Q, are combined in DoubleForest to support the matching of profiles and queries with mappings between the elements of each forest. The basic idea is to maintain mappings MP Q and MQP in different directions between the structures. MP Q determines the set ofcovering queries given a profile, and similarly,MQP determines the set ofcovered profiles given a query.

a

Figure 6.1: Storing and matching of profiles and queries.

The DoubleForest data structure is a building block for more complex routing and matching structures. The main application areas for the struc-ture are the following:

• matching profiles and queries,

• change detection, i.e., finding the set of changed profiles and queries efficiently,

• content-based routing,

• distributed service directories with continuous queries.

We observe that the DoubleForest structure generates a taxonomy for both profiles and queries based on the covering relation. This automatic taxonomy creation is envisaged to be useful for applications, such as direc-tory services and context or metadata-based direcdirec-tory browsing.

Context-based service directory querying and directory merging was investigated in [48]. This mechanism assumes that context data is repre-sented using multi-dimensional discrete values. Each service category is

represented by a graph in which the leaves are the services and other nodes represent different context classifications. Distribution of the service direc-tories is addressed by specifying a merging algorithm that merges the input directories. This system is similar to our method, but lacks the generality of covering and overlapping relations and the optimizations used by the forest data structures. Our approach also allows subspace queries. The proposed method generates the taxonomy automatically based on covering relations. Hence, the merging of two service directories is trivial using the DoubleForest.

We also note that filters may contain custom predicates, for example, semantic predicates such as the ”IS-A” relation. In this case, the predicate would be introduced into the filtering language with the corresponding ontology. Thus the proposed data structure may be used with ontologies and external taxonomies.