6.4 Results

6.4.2 ACMC concept maps

Table 6.1 summarizes the results obtained when the ACMC was tested with the given test data. The ACMC extracted 689 words with frequency ≥ 3, excluding stop words, from the DM book. Of the extracted concepts, 148 were compound words and 232 were nouns. At frequency ≥ 10, 181 concepts were extracted, of which 19 were compound words and 102 were nouns. At frequency ≥ 15, 103 concepts were extracted, of which 10 were compound words and 68 were nouns. The ACMC extracted 138 potential relations of which had frequencies


For the TFCS data, the ACMC extracted 916 words, of which 277 were compound concepts and 244 were nouns. Of these, 251 extracted words, 30 compound words and 97 nouns had frequency ≥ 10, and 191 extracted concepts, 15 compound words and 67 nouns had frequency ≥ 15. The ACMC extracted 498 potential relations with frequency ≥ 7, of which 19 were found to be sensible relations.

6.4. RESULTS 43

Table 6.1: Concepts extracted by the ACMC.

When tested with the Scientific Writing material, the ACMC extracted 896 potential concepts, of which 147 were compound words and 321 were nouns. Of these, 236 extracted concepts, 5 compound words and 121 nouns had frequency ≥ 10, and 161 extracted concepts, 1 compound word and 72 nouns had frequency ≥ 15. 73 relations were extracted.
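The frequency counts reported above can be illustrated with a minimal Python sketch. It is not the ACMC implementation: the stop-word list, the tokenizer and the sample text are hypothetical, and compound words and part-of-speech tagging are left out. It only shows how raising the frequency threshold shrinks the set of candidate concepts.

```python
from collections import Counter
import re

# Hypothetical stop-word list; the list used by the ACMC is not given in the text.
STOP_WORDS = {"the", "a", "of", "and", "is", "in", "to"}

def count_candidates(text):
    """Count lower-cased word tokens, skipping stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)

def concepts_at_threshold(counts, threshold):
    """Return the set of words whose frequency is at least `threshold`."""
    return {word for word, n in counts.items() if n >= threshold}

# Toy text (illustrative only, not taken from the test data).
sample = "cluster " * 15 + "pattern " * 10 + "rule " * 3 + "outlier " * 2 + "the " * 20
counts = count_candidates(sample)

# The three thresholds used in the text: 3, 10 and 15.
for threshold in (3, 10, 15):
    print(threshold, sorted(concepts_at_threshold(counts, threshold)))
```

With the toy text, the candidate set drops from three words at threshold 3 to a single word at threshold 15, which mirrors the shrinking counts in the paragraphs above.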

It was observed that the more sensible single-word concepts, compound concepts and relations had higher frequencies. The highlighted rows in Table 6.2 show the most frequent sensible and non-sensible concepts and relations.

Comparison to Human drawn maps

A comparison between the manually constructed and the ACMC constructed concept maps was made, and the results are summarized in Table 6.3. For the DM book test data, 32 concepts, of which 14 were compound concepts and 24 were nouns, and no relations appeared in both the manually constructed and the ACMC constructed concept maps. For the TFCS test data, 31 concepts, of which 13 were compound concepts and 15 were nouns, and no relations appeared in both maps. Of the 62 concepts in both the manually and ACMC constructed concept maps, 13 were compound words and 32 were nouns, and there were no potential relations.

Table 6.2: List of most sensible and non sensible extracted words.

Table 6.3: Concepts, compound concepts, nouns and relations that appear in both manual and ACMC constructed maps.
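The overlap counts in this comparison amount to a set intersection between the two maps. The following Python sketch shows one way such a count could be obtained. The normalisation step, the rule that a multi-word label is a compound concept, and the sample labels are all assumptions made for illustration, not details taken from the ACMC.

```python
def normalise(label):
    """Lower-case and collapse whitespace so that labels compare reliably."""
    return " ".join(label.lower().split())

def shared_concepts(manual, extracted):
    """Concepts present in both maps, after normalisation."""
    return {normalise(c) for c in manual} & {normalise(c) for c in extracted}

def is_compound(concept):
    """Treat any multi-word label as a compound concept (an assumption)."""
    return len(concept.split()) > 1

# Toy concept lists (illustrative only).
manual_map = ["Data Mining", "cluster", "decision tree", "outlier"]
acmc_map = ["data  mining", "cluster", "pattern", "Decision Tree"]

both = shared_concepts(manual_map, acmc_map)
compound = {c for c in both if is_compound(c)}
print(len(both), len(compound))
```

Without some normalisation, differences in case or spacing would cause matching concepts to be missed, so the choice of normalisation directly affects the reported overlap.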

Precision, Recall and F-Measure

Precision, recall and F-measure of extracted concepts, compound words and relations were calculated for each test data at different thresholds. The results are presented in Table 6.4.

From Table 6.4, it can generally be observed that precision is lower than recall, with a few exceptions. The exceptions occurred for compound words with frequency ≥ 7, apart from the TFCS test data, where the precision was higher for compound words with frequency ≥ 10. It can also be observed that, in most cases, the precision of extracted concepts increased while the recall decreased as the frequency threshold increased. A similar trend was observed with compound words and nouns.
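For readers who want to reproduce measures of this kind, the sketch below computes precision, recall and F-measure from two sets of concepts. It assumes that the manually constructed map serves as the reference set and that the F-measure is the balanced harmonic mean of precision and recall; the text does not state either detail at this point. The function name and the sample sets are hypothetical.

```python
def precision_recall_f(extracted, reference):
    """Return (precision, recall, F-measure) for two sets of concepts."""
    true_positives = len(extracted & reference)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure (assumed): harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Toy sets (illustrative only, not taken from the test data).
extracted = {"cluster", "pattern", "rule", "page", "figure"}
reference = {"cluster", "pattern", "outlier", "classification"}

p, r, f = precision_recall_f(extracted, reference)
print(f"precision={p:.4f} recall={r:.4f} F={f:.4f}")
```

In this toy case two of the five extracted concepts are in the reference set of four, so precision (0.4) is lower than recall (0.5), the same pattern the text reports for most rows of Table 6.4.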

The highest values of precision and recall were observed at different thresholds and for different components (extracted concepts, compound words, nouns and relations).

Overall, SW has the highest precision (1) for compound words with frequency ≥ 15, followed by SW (0.6) for compound words with frequency ≥ 10 and DM (0.5) for compound words with frequency ≥ 15. DM has the highest recall (0.6167) for extracted concepts with frequency ≥ 3, followed by TFCS (0.5818) for nouns with frequency ≥ 3 and DM (0.4211) for compound words with frequency ≥ 3.

The highest precision values of the different components were observed at frequency ≥ 15, with relations being an exception. The highest recall values of all components tested were observed at frequency ≥ 3.

For two of the test data, DM and TFCS, the F-measure tended to increase with the frequency threshold for extracted words and compound words, except at the highest threshold, where the F-measure dropped. For the SW test data, the F-measure decreased as the frequency threshold increased. For relations, the F-measure decreased as the threshold increased for all the test data. The highest F-measure (0.2857) was observed for compound words in the DM test data.

The frequency threshold affected the results: as the threshold increased, fewer concepts were retrieved, but a larger proportion of the retrieved concepts were relevant.
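The effect of the threshold can be made concrete with a small Python sketch that evaluates precision and recall at each of the thresholds 3, 10 and 15. The frequencies and the reference set are invented toy values, chosen only so that the trade-off is visible, and the function name is hypothetical.

```python
def precision_recall_at(freqs, reference, threshold):
    """Precision and recall of the concepts with frequency >= threshold."""
    retrieved = {word for word, n in freqs.items() if n >= threshold}
    hits = len(retrieved & reference)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall

# Toy frequencies and reference concepts (illustrative only).
freqs = {"cluster": 20, "pattern": 16, "figure": 15, "rule": 11,
         "page": 10, "outlier": 4, "chapter": 3}
reference = {"cluster", "pattern", "rule", "outlier"}

sweep = {t: precision_recall_at(freqs, reference, t) for t in (3, 10, 15)}
for threshold, (precision, recall) in sweep.items():
    print(f"threshold={threshold} precision={precision:.3f} recall={recall:.3f}")
```

In this toy data the relevant concepts tend to be the frequent ones, so each increase in the threshold removes more irrelevant words than relevant ones: precision rises while recall falls. Real data need not behave this cleanly, as the exceptions noted for relations show.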

The results indicate that the ACMC performed better at retrieving compound concepts from the SW test data and at extracting relevant concepts from the DM test data.

Low precision and recall values observed in the relations component show that ACMC did not perform well in extracting relevant relations from the test data.

From the observed measures above, we can conclude that ACMC works better with DM test data than with TFCS or SW test data. This conclusion is based on the observation that DM test data has a better precision/recall combination and F-measure. DM test data produced reasonably high precision and recall values despite the values falling within different thresholds. SW test data produced the highest precision value (1), but did not offer a suitable recall combination. At this point, the best frequency threshold to be used in ACMC cannot be determined as the highest precision and recall values fall in different frequency thresholds.

Comparison to Leximancer

Each of the test data was run through Leximancer, and the results can be seen in Figure 6.4. Leximancer extracted 116 concepts from the DM book, 154 concepts from the TFCS test data and 104 concepts from the Scientific Writing test data. See appendices A, B and C for results obtained from


Figure 6.4: Concepts extracted from Leximancer.

Figure 6.4 summarizes the comparisons made between the manual concept maps and the Leximancer concept maps. It was observed that 18 concepts from the DM book appeared in both the manual concept maps and the Leximancer concept maps. The same number of concepts was observed in the TFCS material. 19 concepts appeared in both the manually constructed and the Leximancer constructed concept maps for the Scientific Writing material.

Comparing the ACMC and Leximancer constructed concept maps led to the values displayed in Figure 6.4. The DM book and TFCS test data had 85 and 101 concepts, respectively, that appear in both the Leximancer and the ACMC extracted concepts. 101 concepts from the Scientific Writing test data appear in both the ACMC and the Leximancer constructed concept maps.

Leximancer is more of a graphical tool than a statistical one; therefore, the relations extracted by Leximancer could not be compared to the relations extracted by the ACMC or to those in the human-drawn maps.