
Retrievability Simulation

In document Active Faceted Search (pages 22-26)

We performed retrievability simulations to determine how much of the document space is reachable with different configurations. We would like to see how the different sidebar configurations compare and how well the active facet assistant augments them. The algorithm for this simulation is as follows:

1. A list of sidebar facets is generated

2. The active facet component (if applicable) recommends a facet

3. Select the most relevant facet from either the sidebar or active facet component. If none are relevant, skip to the next facet in the active facet component.

4. Continue steps 2 and 3 until either:

• The selected document is in the top 10 search results

• The document has no more relevant facets
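The four steps above can be sketched in code. The following is an illustrative toy model, not the authors' implementation: documents are represented as small sets of facet tags, "search rank" is the target's alphabetical position among documents matching every selected facet, and all data is randomly generated.

```python
import random

# Toy corpus (illustrative only): each document carries six facet tags
# drawn from a pool of thirty.
CORPUS = {f"doc{i}": set(random.Random(i).sample(range(30), 6))
          for i in range(100)}

def search_rank(target, selected):
    """Rank of `target` among documents matching all selected facets."""
    matches = sorted(d for d, tags in CORPUS.items()
                     if set(selected) <= tags)
    return matches.index(target) + 1

def simulate(target, sidebar_facets, active_suggestions=()):
    """Select relevant facets until `target` is in the top-10 results.

    Mirrors the four steps above: take a relevant facet from the sidebar
    when one exists; otherwise fall back on (and skip through) the active
    suggestions. Returns (selections, skips), or None if the document is
    never retrieved.
    """
    relevant = CORPUS[target]
    selected, skips = [], 0
    suggestions = iter(active_suggestions)
    while True:
        if search_rank(target, selected) <= 10:   # step 4: stop condition
            return len(selected), skips
        pool = [f for f in sidebar_facets         # steps 1 and 3
                if f in relevant and f not in selected]
        if pool:
            selected.append(pool[0])
            continue
        for s in suggestions:                     # steps 2 and 3
            if s in relevant and s not in selected:
                selected.append(s)
                break
            skips += 1                            # suggestion not relevant
        else:
            return None  # no relevant facets remain anywhere
```

With a sidebar that happens to contain the target's own facets, the document is retrieved without skips; shrinking the sidebar and relying on active suggestions reproduces the trade-off the simulation measures.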

Figure 7: Percent of documents that were retrieved by sidebar configuration. We can see that the addition of the active facet assistant increased the retrievability of documents.

As we can see from the results in Figure 7, the manually curated sidebar had much lower retrievability than the LDA-calculated sidebar. This is most likely because the manually curated sidebar had far fewer facets to choose from: only 25, compared to 53 unique facets in the LDA sidebar. Interestingly, when the active facet assistant was introduced, the combination of active facet and manually curated sidebar slightly outperformed the combination of active facet and LDA-generated sidebar. This may be due to the simulation overfitting when selecting from the sidebar.

We also wanted to measure the position of the document in the search results as more facets are selected. Figure 8 shows the results. Again, the manually curated sidebar does not perform well on its own but works well when combined with the active facet assistant. In all configurations other than manual-only, the majority of the position improvement occurs within the first few iterations; after about six iterations, no further improvement occurs.

Figure 8: Position of the document in the search results. A lower position is more likely to be seen by the user. As more facets are selected, the document position improves under all configurations except manual curation.

The active facet component shows a finite number of recommended facets at a time (in this simulation, just one) and allows the user to skip to the next set of facets if none are relevant. For configurations that include the active facet component, we would like to see how many tags are skipped. Figure 9 shows the mean number of tags skipped at each iteration by configuration. The mean number of tags skipped is much higher when the active facet component is combined with the manually curated facet sidebar. This reflects the fact that the greater number of facets in the LDA sidebar allows the user to add more context to start with.

As the number of iterations of the simulation increases, the number of skipped tags in both configurations trends upward. This is to be expected: each document has a finite set of relevant facets, so as more facets are found, the number of undiscovered relevant facets decreases. This in turn decreases the probability that the next suggested facet matches an undiscovered relevant facet.
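This declining hit probability can be made concrete with a simple model. Assuming, purely for illustration, that the active component draws suggestions uniformly (with replacement) from F candidate facets, of which the document has R relevant ones and k are already found, the expected number of skips before the next hit follows a geometric distribution. The numbers below are made up, not taken from the study.

```python
def expected_skips(F, R, k):
    """Expected skips before a suggestion hits one of the R - k
    undiscovered relevant facets, assuming uniform suggestions
    with replacement from F candidate facets."""
    p = (R - k) / F     # chance a suggestion is a new relevant facet
    return (1 - p) / p  # mean of a geometric distribution

F, R = 53, 8  # hypothetical: 53 candidate facets, 8 relevant to the doc
for k in range(R):
    print(f"{k} facets found: expected skips = {expected_skips(F, R, k):.1f}")
```

Under these made-up parameters, expected skips rise from about 5.6 when no relevant facets have been found to 52 when only one remains, matching the upward trend in Figure 9.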

Figure 9: The number of tags skipped in the active faceting component when paired with different sidebar components. More tags are skipped when paired with the sidebar containing manually curated facets.

6 Results

6.1 Task Completion Time

One of the most straightforward measures of a search engine’s success is the time required to find satisfactory results. Although it may not be the most important measure, users generally become frustrated when a task takes too much time.

Task completion times did not differ significantly between configurations. Table 3 and Figure 10 show the time needed to complete each task by system configuration. We can see that adding the active facet component seems to have slightly increased the time required to complete each task over the search-only baseline.

Configuration      Mean     Std      Min     25%      50%      75%      Max

Active Only        201.27   176.17   25.65    74.16   133.41   289.54   714.64
Search Only        181.74   132.05   17.53    83.01   136.57   257.31   586.79
LDA + Active       160.37   120.58   26.54    85.14   119.87   191.81   586.09
Manual             176.53   121.20   15.95    89.21   161.37   226.55   600.48
Manual + Active    178.79   107.63   31.25   102.18   134.76   247.97   454.66
LDA                161.17   106.84   43.62    76.33   107.00   242.18   421.92

Table 3: Time needed to complete each task based on system configuration
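The columns of Table 3 are standard descriptive statistics, and a row can be reproduced from raw completion times with the Python standard library alone. The times below are fabricated sample data, not the study's measurements.

```python
import statistics

# Fabricated completion times for one hypothetical configuration.
times = [25.6, 74.2, 133.4, 180.0, 289.5, 310.2, 714.6]

q1, _, q3 = statistics.quantiles(times, n=4)  # median-exclusive quartiles
row = {
    "Mean": statistics.mean(times),
    "Std": statistics.stdev(times),  # sample standard deviation (n - 1)
    "Min": min(times),
    "25%": q1,
    "50%": statistics.median(times),
    "75%": q3,
    "Max": max(times),
}
print(row)
```

Note that quartile conventions differ between tools: `statistics.quantiles` defaults to the median-exclusive method, while pandas' `describe` interpolates linearly, so the two can disagree slightly on small samples.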

Figure 10: Time needed to complete each task

To determine whether the distributions of completion times differ significantly, we performed the Wilcoxon rank-sum test for each pair of system configurations. As we can see in Table 4, the distributions of completion times do not differ significantly between system configurations. The largest difference was between 'Manual + Active Faceting' and 'LDA + Active Faceting', with a p-value of 0.102.
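A single pairwise comparison of the kind behind Table 4 can be run with SciPy's Wilcoxon-Mann-Whitney implementation. The two samples below are fabricated completion times, not the study's data.

```python
from scipy.stats import mannwhitneyu

# Fabricated completion times (seconds) for two hypothetical configurations.
manual_active = [31.2, 102.2, 134.8, 178.8, 248.0, 454.7]
lda_active = [26.5, 85.1, 119.9, 160.4, 191.8, 586.1]

stat, p = mannwhitneyu(manual_active, lda_active, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p:.3f}")
```

`mannwhitneyu` returns the U statistic and the p-value; p-values above the chosen significance level, as throughout Table 4, indicate that the two distributions are statistically indistinguishable.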

                   Active   Manual   Manual + Active   LDA     LDA + Active

Search Only        0.453    0.469    0.414             0.270   0.271
Active                      0.451    0.355             0.402   0.371
Manual                               0.417             0.340   0.205
Manual + Active                                        0.144   0.102
LDA                                                            0.473

Table 4: P-values of the Wilcoxon–Mann–Whitney test comparing task completion time among various configurations

We can see that changing the search configuration has no significant effect on the task completion time. While these results may be disappointing, they show that an active facet element, a sidebar, or any combination of the two can be added to a search system without negatively affecting the system's performance.
