
Dialogue system based on fuzzy logic and word embeddings


Academic year: 2023


Turku University of Applied Sciences Thesis | Sara Jose Roig Bachelor’s thesis

Information and Communications Technology 2022

Sara Jose Roig

Dialogue system based on fuzzy logic and word embeddings

Abstract

Turku University of Applied Sciences

Information and Communications Technology 2022 | 32

Sara Jose Roig

Dialogue system based on fuzzy logic and word embeddings

The main purpose of this thesis was to create a proof of concept of a dialogue system for a videogame. The system consists of a dialogue graph, a word search that allows the player to interact with a non-player character, and fuzzy logic variables that control the state of the character.

The system was created using the game engine Unity with the Ink-unity integration plugin, the Inky editor and some external C# libraries. The thesis assesses the feasibility of matching words in a word search using word embeddings, obtained with a pretrained unsupervised learning algorithm, and semantic similarity, measured with cosine similarity.

The results showed that the word embedding model and the semantic similarity measure used were poorly suited to the task because of their low precision. In contrast, the dialogue graph and the fuzzy logic state variables achieved the desired outcome.

Keywords:

Unity, word embedding, cosine similarity, GloVe algorithm, fuzzy logic


Contents

List of abbreviations

1 Introduction

2 UI and System Design
2.1 UI Design
2.2 External libraries
2.3 Ink usability in Unity
2.4 Graph and structure

3 Fuzzy logic variables
3.1 Theory and the need for fuzzy sets
3.2 Fuzzy sets and membership functions
3.3 Fuzzy set of rules

4 Word search
4.1 Word Embeddings with GloVe
4.2 Comparing GloVe to other embeddings that could be used
4.3 Word and sentence cleaning methods
4.4 Word comparison: cosine similarity
4.5 Other types of distances

5 Methodology

6 Discussion

7 Conclusion

References

Equations

Equation 1. Elements of a linguistic variable.


Equation 2. Definition of a linguistic variable.
Equation 3. GloVe cost function.
Equation 4. Definition of cosine similarity.
Equation 5. Definition of Euclidean distance.
Equation 6. Definition of Manhattan distance.

Figures

Figure 1. UI with text dialogue.
Figure 2. UI with word search field.
Figure 3. UI with multi-option field.
Figure 4. Graph diagram.
Figure 5. The fuzzy logic control flow.
Figure 6. Plutchik wheel.
Figure 7. Diagram of the fuzzy subset.
Figure 8. Stability with a value of 53.
Figure 9. Previous trust with a value of 20.
Figure 10. Happiness with a value of 35.
Figure 11. Area of the fuzzy answer.

Tables

Table 1. Set of rules.


List of abbreviations

AI Artificial Intelligence.

BERT Bidirectional Encoder Representations from Transformers.

CoG Center of Gravity.

GloVe Global Vectors.

FL Fuzzy Logic.

FLS Fuzzy Logic Sharp.

JSON JavaScript Object Notation.

LGPL Lesser General Public License.

MIT Massachusetts Institute of Technology.

ONNX Open Neural Network Exchange.

NLP Natural Language Processing.

NPC Non-player Character.

UI User Interface.

XML Extensible Markup Language.


1 Introduction

In the current world of videogames, non-player characters (NPCs) have a predetermined set of answers from which the player can choose. These answers are predictable and quite unnatural, so every interaction breaks the immersion in the videogame. Besides, most NPCs do not show any type of emotional responsiveness and have a fixed behavior, which reinforces the feeling that they are throwing data at the player instead of mimicking a real conversation. This thesis aims to find different ways to address this problem and to create unique experiences for the player.

Over the last few decades, the video game industry has grown quickly from a small market to a massive industry (Osathanunkul 2015). Despite this rapid growth, several possible remedies to this problem have not been explored.

Some innovative and interesting methods have been tried. Some games have used dialogue systems and methods of storytelling that did not use a decision tree with multiple options. Inspiration has been taken from dialogue systems such as the ones in Event[0], Her Story and Scribblenauts.

The field of NLP has also expanded rapidly in recent years. New techniques and algorithms have emerged, and their practical applications have spread to fields such as text recognition, translation and extraction of meaning. However, its potential in videogames could be further exploited.

This thesis is structured in three parts. The first part (Chapter 2) explains the UI design and the graph structure of the dialogue. The second part (Chapter 3) consists of a system that allows the NPC to change emotional states and determines the trust it has in the player. This is accomplished using a fuzzy logic system inspired by ExtremeAI, a psychology-based personality engine that creates adaptive NPC personalities (Georgeson and Child 2016).

The third part (Chapter 4) researches and tests a word search in a set of questions inside a dialogue. The aim is to return not only exact matches but also words with some degree of semantic similarity. That makes the player think more about what to ask the NPC instead of choosing a predefined line of dialogue.

This is accomplished by using word embeddings obtained from a pretrained NLP model based on unsupervised learning. Afterwards, these embeddings are compared to determine their similarity.

This project was implemented in Unity Engine and with some external .NET libraries. The research was carried out by studying scientific papers which describe some suitable methods and algorithms that could be used for the desired outcome.

The final goal of the thesis is to demonstrate the feasibility of the system described and to assess whether its performance would allow it to be used in a practical videogame beyond a proof of concept.


2 UI and System Design

The main means of interaction with an NPC is a UI that displays the text and lets the player select options. This chapter explains the elements of this UI, the libraries used for the system, and the editor and plugin used to create the dialogue graph.

2.1 UI Design

The UI design was created to be simple and consistent as the main focus of this thesis is the technical aspects of the dialogue. For that reason, it consists of the following elements:

● A simple box of text which fills in dynamically when the lines of dialogue are obtained from the graph as shown in Figure 1.

● An input field where the player can search for questions using words as shown in Figure 2.

● Multi-option choices are given to the player as possible replies to the NPCs as shown in Figure 3.

Figure 1. UI with text dialogue.

Figure 2. UI with word search field.

Figure 3. UI with multi-option field.

2.2 External libraries

The UI was created using the game engine Unity, alongside the following external .NET libraries: Microsoft.ML for word embeddings; Ink-unity integration to export and manipulate the Ink dialogue; Inflector for singularizing; Fuzzy Logic Sharp (FLS) for fuzzy logic; and Accord.NET for semantic similarity calculations.

2.3 Ink usability in Unity

Ink is a narrative engine used to create dialogue systems and narratives for games. The Ink-unity integration plugin was installed through the Unity Asset Store and the files were edited using the Inky editor. The system was designed so that the dialogue text can be accessed from outside Unity. A graph is dynamically created while an ink file is being edited, and it can be imported into a Unity project.

Once the graph has been saved, it is automatically compiled to JavaScript Object Notation (JSON) by the Ink-unity integration plugin. The ink runtime engine can then load the story and make it accessible in the code. To move forward in the story, the code checks whether the story can continue and displays the dialogue or the choices.

2.4 Graph and structure

For this project and for the purpose of demonstration, the following graph structure has been created as shown in the diagram of Figure 4.

Figure 4: Graph diagram.

The dialogue has an interrogation style and is built as a loop. However, it is not constrained to that structure and can be adapted to the needs of the dialogue. The graph is formed by a main node, which contains a list with all the questions that can be asked of the NPC. Each question contains a tree with a variable number of levels, usually between 1 and 4. To keep it simple, for now most of them have only one answer. Those levels can contain multi-option choices, and their child nodes are the possible replies of the NPC to the player.

To decide the path inside this subtree, three global variables have been created. Besides, the graph can also keep a record of other nodes that have been visited. This information is used together with the choices of the player to select an answer, which determines how honest the answer given to the player is. Once the player reaches any of the leaf nodes, the dialogue returns to the list of questions.
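The hub-and-subtree structure described above can be sketched as a small data structure. This is only a minimal Python illustration, not the Ink graph used in the thesis; the `Node` class, the `walk` helper and the sample lines of dialogue are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of the dialogue graph (hypothetical structure)."""
    text: str
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def walk(question: Node, choices: List[int]) -> List[str]:
    """Follow the player's choices down one question subtree.

    Returns the lines shown on the way. Reaching a leaf means the
    dialogue goes back to the main list of questions.
    """
    shown = [question.text]
    node = question
    for choice in choices:
        if node.is_leaf():
            break
        node = node.children[choice]
        shown.append(node.text)
    return shown


# Main node: a list of questions, each one the root of a small subtree.
hub = [
    Node("Where were you last night?", [
        Node("At home.", [Node("Alone."), Node("With a friend.")]),
    ]),
    Node("Do you know the victim?", [Node("No.")]),
]

lines = walk(hub[0], [0, 1])
```

In the actual system the branch taken at each level would depend on the global state variables rather than on a fixed list of indexes.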


3 Fuzzy logic variables

3.1 Theory and the need for fuzzy sets

Fuzzy logic was first introduced by Lotfi Zadeh in 1965 and, as defined in the Stanford Encyclopedia of Philosophy, is intended to model logical reasoning with vague or imprecise statements (Cintula et al. 2021). It follows the principles of classical logic, with the main difference that instead of only two truth values it allows a larger set of truth degrees.

A few variables are created to modulate the level of trust the NPC has towards the player. Depending on those variables, an answer will be chosen from the graph. However, it is difficult to determine what it means to be happy, sad or trust someone, as they are abstract concepts. A system with fuzzy sets was chosen over classical sets due to its level of approximation for the function defining the emotional state of the NPC.

A decision tree using discrete values gives a coarser approximation and a less exact function than one created with fuzzy logic values. This is because each discrete value only returns true or false, whereas each value in a fuzzy logic system has a function defining it. In that way the system can determine, for example, the level of sadness of an NPC, and not only whether it is sad or not, as would have to be specified in a decision tree with discrete values.

For that very reason, linguistic variables have been implemented. Those are variables whose values are not numbers but words or sentences in a natural or artificial language (Zadeh 1973).

Each linguistic variable A is formed by the following elements:

𝐴 = < 𝑥, 𝑇(𝑥), 𝑋, 𝐺, 𝑀 >

Equation 1: Elements of a linguistic variable.

Here x is the name of the variable, T(x) is the set of terms of x, X is the universe of discourse, G is a set of syntax rules and M is a set of semantic rules.


The membership function 𝛾𝑎 is added to the definition of a set. It assigns to every value x a degree between 0 and 1, which specifies the degree of membership of that value in the set.

𝐴 = { (𝑥 , 𝛾𝑎(𝑥)) | 𝑥 ∈ 𝑋 , 𝛾𝑎(𝑥) ∈ [0, 1] }

Equation 2: Definition of a linguistic variable.

The membership function 𝛾𝑎 can be modified to change values of the set. In that way, the range of values of x can be changed as they are decided depending on the needs of the system.

The syntax rules are used to define the set of terms created for the linguistic variable. The semantic rules are used to transform the set of terms to measurable values. In section 3.2, all the mentioned elements are specified for each linguistic variable used in the system.

The fuzzy logic pipeline can be seen in Figure 5 and is defined as:

● Fuzzification: Define the input fuzzy sets and the membership function of each one. All the values are encoded as vectors.

● Fuzzy Inference: Create a rule base to apply to the fuzzy sets.

● Defuzzification: Output a combined value for the output variable.

Figure 5: The fuzzy logic controller flow.

3.2 Fuzzy sets and membership functions

The fuzzy logic model needs an input of 3 float numbers (numeric values with floating decimal points) and returns 3 new values which determine the current state of the NPC. The input values are obtained from saved global variables in the Ink dialogue graph.

Afterwards, the fuzzy logic model is called. Once the inference and defuzzification are done, the new values are saved in the global variables of the graph. The updated scalar values determine the answer of the NPC: the answer assigned to the term with the highest membership is chosen. Consequently, this determines which path of the graph is followed.

Initially, a Plutchik wheel with four variables and three intensity levels was taken as a reference and inspiration. The four input variables for the fuzzy sets were ecstasy-grief, admiration-loathing, vigilance-amazement and rage-terror, as seen in Figure 6.

Figure 6: Plutchik wheel.


However, it was discarded as there were too many states to consider; a simpler system was needed. A simplified version was finally chosen to have a better control of the output; more input variables may be included if there is a need for a wider range of output states.

The final linguistic variables with their respective set of terms for the fuzzy subsets are defined as: happiness (joy, neutral, sad) and stability (fear, calm and anger).

The output variable was defined as trust level (trust, indifference and disgust).

The tendency of a specific NPC to become happy or sad faster, or its level of stability, can be adjusted by changing the threshold values of each fuzzy subset. For example, if an NPC has a stronger tendency to become angry, the threshold from calm to anger could be lowered.

Figure 7: Diagram of the fuzzy subset.

As seen in Figure 7, if the linguistic variable happiness has a value of 37.5, for example, then happiness = [0.5, 0.5, 0].

The membership functions need to be normalized and symmetrical, and the membership values should add up to 1. In that way, the output results are human readable and easy to interpret.


Because there are no general criteria for choosing the shape of the membership functions and the choice depends on each specific case, two different shapes were tested: triangular and Gaussian. Both functions were effective, so the triangular function was chosen over the Gaussian because it is computationally less expensive and has shorter runtimes.
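As an illustration of fuzzification with triangular membership functions, the sketch below reproduces the example value happiness = 37.5. The breakpoints (triangles peaking at 25, 50 and 75 on a 0-100 scale) are an assumption chosen to match the membership degrees quoted in this chapter; they are not stated in the text, and the function names are hypothetical.

```python
def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Degree of membership of x in a triangular fuzzy subset."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Assumed breakpoints on a 0-100 scale (not given in the source text).
HAPPINESS_TERMS = {
    "joy": (0.0, 25.0, 50.0),
    "neutral": (25.0, 50.0, 75.0),
    "sad": (50.0, 75.0, 100.0),
}


def fuzzify(x: float, terms: dict) -> list:
    """Return the membership degree of x in every term, in declaration order."""
    return [triangular(x, *breakpoints) for breakpoints in terms.values()]


happiness = fuzzify(37.5, HAPPINESS_TERMS)  # [joy, neutral, sad]
```

Under these assumed breakpoints, shifting a peak to the left or right is what changes how quickly a character moves from one term to the next.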

3.3 Fuzzy set of rules

Once the fuzzy sets have been defined, a set of rules is defined using the principles of propositional logic. These rules need to be determined manually, as no clustering method or algorithm could be used to derive them.

For that purpose, a set of modus ponens arguments was created, with rules of the form 𝐴 ∩ 𝐵 ∩ 𝐶 → 𝐶′, where A, B, C and C′ are fuzzy sets. The following table presents the experimental rules, excluding the ones that were considered unnecessary.

Table 1: Set of rules. Each row reads: IF happiness AND stability AND previous trust level THEN trust level.

Happiness         Stability        Previous trust level       Trust level
joy               fear OR anger    trust OR indifference      indifference
joy               fear OR anger    disgust                    disgust
joy OR neutral    calm             trust OR indifference      trust
joy OR neutral    calm             disgust                    indifference
neutral           fear             trust OR indifference      indifference
neutral           fear             disgust                    disgust
neutral           anger            trust                      trust
neutral           anger            indifference OR disgust    disgust
sad               calm             trust                      trust
sad               calm             indifference OR disgust    indifference
sad               anger OR fear    n/a                        disgust

The maximum number of rules is the number of combinations of terms in the model, which is the number of fuzzy subsets per variable raised to the number of variables, in this case 3^3 = 27. The number of rules is low enough for this model; therefore, all the possibilities have been covered.
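The size of the rule space can be checked by enumerating every combination of terms. The snippet below is a small sketch using the term names from section 3.2; the dictionary layout is only an illustration.

```python
from itertools import product

TERMS = {
    "happiness": ["joy", "neutral", "sad"],
    "stability": ["fear", "calm", "anger"],
    "previous_trust": ["trust", "indifference", "disgust"],
}

# Every possible antecedent: one term per input variable.
combinations = list(product(*TERMS.values()))
print(len(combinations))  # 3 ** 3 = 27
```

Rules that use OR in Table 1 each cover several of these combinations, which is why the table has fewer rows than the full enumeration.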

However, if the number of variables increased, some combinations would not be considered. Moreover, not all rules fire in each calculation, so the rules that do not fire are not evaluated further. Both factors make the model less computationally expensive. The defuzzification is handled by the C# FLS library by creating a fuzzy engine and applying a defuzzifier class without any further steps.

If it had to be done manually, the process to calculate the output variable value would be the following. As an example, consider input values of happiness = 35, stability = 53 and previous trust level = 20, for which the current trust level is to be found.


In that case, happiness would lie between joy and neutral, stability would be mostly calm with a bit of anger, and the previous trust level would be trust, as marked by the black line in Figures 8, 9 and 10.

Figure 8: Stability with a value of 53.

Figure 9: Previous trust with a value of 20.

Figure 10: Happiness with a value of 35.

In this instance, more than one rule fires. However, the calculation of only one is shown as an example: IF neutral AND anger AND trust THEN trust. The rest of the rules can be looked up in Table 1.

As seen in the previous figures, the value for neutral is 0.4, the value for anger is 0.12 and the value for trust is approximately 0.8, marked in red. For an AND join the minimum value is taken; for an OR join, the maximum.

Then neutral AND anger AND trust is written as follows:

Level of trust = min(0.4, 0.12, 0.8) = 0.12
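The min and max operators used for AND and OR can be written in a couple of lines. This is a minimal sketch of the calculation for the single rule above, using the approximate membership degrees quoted in the text; the helper names are hypothetical and do not come from the FLS library.

```python
def fuzzy_and(*degrees: float) -> float:
    """AND join: the rule is only as strong as its weakest condition."""
    return min(degrees)


def fuzzy_or(*degrees: float) -> float:
    """OR join: the strongest alternative decides."""
    return max(degrees)


# IF neutral AND anger AND trust THEN trust
firing_strength = fuzzy_and(0.4, 0.12, 0.8)
```

A rule with an OR inside one column, such as "joy OR neutral", would first combine those two degrees with `fuzzy_or` and then pass the result to `fuzzy_and`.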

Therefore, the output of this rule falls in the fuzzy subset trust, whose membership function can be seen in Figure 9. The answer is the trapezoidal region underneath the value 0.12 in the fuzzy subset trust, as can be seen in Figure 11.

Figure 11: Area of the fuzzy answer.

The default engine from the FLS library is used, which applies center of gravity (CoG) defuzzification. It converts the fuzzy answer to a scalar. In this case the CoG formula for a trapezoid was used. The same procedure is followed for every rule that fires.
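Center of gravity defuzzification for the single rule of the worked example can be approximated numerically as follows. This is a sketch under stated assumptions, not the FLS implementation: the trust subset is assumed to be a triangle with breakpoints (0, 25, 50), the centroid is computed with a midpoint sum instead of the closed-form trapezoid formula, and only one fired rule is considered, whereas a full engine would aggregate every rule that fires.

```python
def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Degree of membership of x in a triangular fuzzy subset."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def centre_of_gravity(membership, lo: float, hi: float, steps: int = 10_000) -> float:
    """Approximate the centroid of the area under `membership` on [lo, hi]."""
    width = (hi - lo) / steps
    area = 0.0
    moment = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * width  # midpoint of the k-th slice
        mu = membership(x)
        area += mu * width
        moment += x * mu * width
    return moment / area if area else 0.0


def clipped_trust(x: float) -> float:
    """Assumed trust triangle, cut off at the firing strength 0.12."""
    return min(0.12, triangular(x, 0.0, 25.0, 50.0))


crisp_trust = centre_of_gravity(clipped_trust, 0.0, 100.0)
```

Because the clipped region is symmetric around the assumed peak, the crisp value lands on that peak; with several fired rules the aggregated shape is usually asymmetric and the centroid shifts accordingly.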


4 Word search

As explained in section 2.4, the user inputs a word and the game finds questions that contain that word or similar ones. The aim is to show not only sentences containing the exact word but also sentences containing words with a similar meaning.

With a dictionary of synonyms, the only words that would match an input are its exact synonyms. With an NLP model, however, words whose meaning is merely close are matched as well. For example, if the user searches for the word “witness”, sentences containing the words “person”, “declaration” or “proof” may also be shown.

Before the game is built, the questions are split into words and normalized to make the search easier. Plural words are singularized. Stop words, which are commonly used English words that add little meaning, such as “should”, “at” or “the”, are removed from the sentences because the player should not find sentences through them. The stop word dictionary is editable, so there is full control over which words are removed. Finally, the words are embedded (converted into vectors of real values), serialized (converted into a stream of bytes, here in XML format) and saved in the project. In that way, there is no need to embed and serialize those words for every search, which would be extremely inefficient. Before the player searches for any word, the data is deserialized and stored in a collection for use at runtime.

A similar method is used when the user inputs a word. First the input is validated, then it is embedded into a vector using the GloVe embeddings, and finally it is compared to the words in the deserialized data.

Sentences are saved in a list and identified and referred to by their index. Every time there is a match with a particular word the index is saved in a list. Once all the possible matches have been found, this list of indexes is used to display the questions in the UI.
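The cleaning and indexing steps described above can be sketched as follows. This is an illustration only: the stop word set and the sample questions are made up, `singularize` is a crude stand-in for the Inflector library, and the embedding and XML serialization steps are left out.

```python
import re

# Editable stop word set (illustrative subset).
STOP_WORDS = {"should", "at", "the", "did", "you", "was"}


def singularize(word: str) -> str:
    """Crude stand-in for a real singularizer: drops a trailing 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def clean(sentence: str) -> list:
    """Lowercase, split into words, drop stop words, singularize."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [singularize(w) for w in words if w not in STOP_WORDS]


questions = [
    "Where did you hide the documents?",
    "Who was at the office?",
]

# Map every cleaned word to the indexes of the questions that contain it.
index = {}
for i, question in enumerate(questions):
    for word in clean(question):
        index.setdefault(word, []).append(i)
```

Looking up `index["document"]` then yields the list of question indexes to display, which mirrors the index list described in the paragraph above.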

4.1 Word Embeddings with GloVe

Global Vectors for Word Representation (GloVe) is an unsupervised learning algorithm for obtaining vector representations for words (Pennington et al. 2014).

These vector representations are obtained from the statistics of word co-occurrences in the training data. The model takes pairs of words that appear close to each other in a sentence, within a symmetric context window, and counts the number of times a word j appears in the context of a word i.

Its objective is to extract semantic relationships between different word embeddings. Therefore, this method is built not only using word probabilities but co-occurrence probabilities. The values are weighted differently depending on their contextual distance instead of straight away their probabilities. In that way, words that are far away have less weight than words that are closer.

Some words tend to have a high number of co-occurrences whilst others have none. Applying the logarithm to the entries of the matrix X_ij reduces this imbalance and makes the values more uniformly distributed.

GloVe minimizes a weighted least-squares cost function to learn meaningful word embeddings.

The cost function is defined as follows, where X_ij is the number of co-occurrences of words i and j, and f(X_ij) is a weighting function that is 0 when X_ij = 0 and regulates the weight of frequent and infrequent words. The vectors w_i and w_j are the word embeddings, combined through their dot product, and b_i and b_j are the bias terms.

\[ J = \sum_{i,j=1}^{V} f(X_{ij}) \left( w_i^{T} w_j + b_i + b_j - \log X_{ij} \right)^{2} \]

Equation 3: GloVe cost function (Pennington et al. 2014).
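To make Equation 3 concrete, the following sketch evaluates the cost for a tiny co-occurrence matrix. It follows the simplified notation of the equation above, with a single set of vectors and biases. The weighting function uses the form and default values (x_max = 100, alpha = 0.75) reported by Pennington et al. (2014); the function names and the toy data are hypothetical, and no training loop is shown.

```python
import math


def weight(x: float, x_max: float = 100.0, alpha: float = 0.75) -> float:
    """Weighting function f: zero for unseen pairs, capped at 1 for frequent ones."""
    if x <= 0:
        return 0.0
    return (x / x_max) ** alpha if x < x_max else 1.0


def glove_cost(X, W, b) -> float:
    """Weighted squared error between w_i . w_j + b_i + b_j and log X_ij."""
    V = len(X)
    total = 0.0
    for i in range(V):
        for j in range(V):
            if X[i][j] <= 0:
                continue  # f(0) = 0, so the pair contributes nothing
            dot = sum(wi * wj for wi, wj in zip(W[i], W[j]))
            error = dot + b[i] + b[j] - math.log(X[i][j])
            total += weight(X[i][j]) * error ** 2
    return total


# Toy example: two words that co-occur once, with orthogonal vectors.
X = [[0.0, 1.0], [1.0, 0.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
cost = glove_cost(X, W, b)
```

Skipping the pairs with X_ij = 0 is what keeps log X_ij from being evaluated at zero, which is the role the weighting function plays in the equation.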

4.2 Comparing GloVe to other embeddings that could be used

The pretrained GloVe embeddings were used in the project instead of a custom-made model to keep the system as topic-independent as possible. Even though a custom-made model could be more tailored to the problem, such models are extremely time consuming to train and need vast quantities of topic-specific data, which is not available and cannot be created within the scope of this project. Besides, using a custom-made model would make tuning the model hyperparameters more difficult and would not guarantee a better result than the generic model used.

Other pre-trained models were considered as well, such as FastText, BERT and WordNet. Initially, a Bidirectional Encoder Representations from Transformers (BERT) model, an NLP model created by researchers at Google AI Language, was to be trained in Python, converted to Open Neural Network Exchange (ONNX) format, saved in the Unity project and accessed using Barracuda, a lightweight package for neural network inference in Unity. That method presented technical problems during import and was therefore discarded. However, it is a model to strongly consider in the future if the input becomes a full sentence instead of the single word that the current system allows.

Creating a model with the Unity ML-Agents Toolkit was also considered. It is an open-source project that enables games and simulations to serve as environments for training intelligent agents (GitHub - Unity-Technologies 2021). However, that would involve creating a custom model and training it inside Unity, which is suboptimal for the reasons explained at the beginning of this section.

4.3 Word and sentence cleaning methods

When a word is entered, it is first checked to be a valid input, as the system currently accepts only one word per search. Moreover, regular expressions (regex) are used to check that the word contains no numbers or special characters.

Then the word is changed to lowercase and, if it was plural, changed to singular using the Inflector library. Lemmatization, a process that obtains the root of a given word through morphological analysis, would be preferable. In that way, words such as “propose”, “proposition” and “proposing”, for example, would be changed to the root “propose”. However, none of the .NET libraries found were fit for this task.
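The validation step can be sketched with a single regular expression. The pattern and the function name below are assumptions for illustration, since the text does not give the expression actually used, and singularization is omitted.

```python
import re
from typing import Optional

# One word made only of letters: no spaces, digits or special characters.
SINGLE_WORD = re.compile(r"[A-Za-z]+")


def normalise_query(raw: str) -> Optional[str]:
    """Return the lowercased word, or None if the input is not a single valid word."""
    word = raw.strip()
    if not SINGLE_WORD.fullmatch(word):
        return None
    return word.lower()
```

For instance, `normalise_query(" Witness ")` gives `"witness"`, while `"two words"` or `"w1tness"` are rejected.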

4.4 Word comparison: cosine similarity

Once the word has been embedded, it is compared using cosine similarity, which is used to determine the relatedness between two words. Cosine similarity measures the similarity between two vectors of an inner product space (Han et al. 2012).

Computing the cosine gives a measure of the angle between the two embedded vectors. Cosine similarity was used instead of cosine distance to make the result more human readable. The result is 1 if the embeddings are the same, and therefore the word is the same, and 0 if the vectors are orthogonal, meaning the words are unrelated. Cosine similarity is defined as the dot product of the vectors divided by the product of their magnitudes, where x and y are the two vectors to compare.

\[ \cos(\theta) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert} \]

Equation 4: Definition of cosine similarity.
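Equation 4 translates directly into code. The function below is a plain Python sketch rather than the Accord.NET call used in the project; returning 0 for a zero-length vector is a convention chosen here, not something stated in the text.

```python
import math


def cosine_similarity(x, y) -> float:
    """Dot product of x and y divided by the product of their magnitudes."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # convention for the undefined case
    return dot / (norm_x * norm_y)
```

Vectors that point in the same direction score 1 regardless of their length, orthogonal vectors score 0, and opposite vectors score -1.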

If the cosine similarity is higher than a threshold, the question containing the word is shown. A cosine similarity of 1 means that the embeddings are the same and therefore the words are the same. However, to be more cost efficient, the system first checks whether the words are the same by comparing their strings. If so, no further assessment is needed and the question is shown. If not, the word embeddings are compared.

To define the threshold, the similarity values of different pairs of words were examined. Some of the words were compared and, after trial and error, a threshold in the range 0.4-0.6 was chosen. If the words are changed in the future, a new threshold needs to be decided.
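The matching order described above, a cheap string comparison first and the embedding comparison only when needed, could look like the sketch below. The three-dimensional vectors are toy values invented for the example (real GloVe vectors have many more dimensions), and `find_questions` is a hypothetical name.

```python
import math


def cosine_similarity(x, y) -> float:
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norms if norms else 0.0


def find_questions(query, question_words, embeddings, threshold=0.4):
    """Return the indexes of the questions that contain a word matching the query."""
    hits = []
    for index, words in enumerate(question_words):
        for word in words:
            if word == query:  # exact string match: no embedding needed
                hits.append(index)
                break
            if query in embeddings and word in embeddings:
                if cosine_similarity(embeddings[query], embeddings[word]) > threshold:
                    hits.append(index)
                    break
    return hits


# Toy vectors, not real GloVe embeddings.
embeddings = {
    "witness": [0.9, 0.1, 0.0],
    "proof": [0.7, 0.6, 0.1],
    "car": [0.0, 0.1, 0.9],
}
question_words = [["proof"], ["car"], ["witness"]]
result = find_questions("witness", question_words, embeddings)
```

Raising `threshold` makes the search stricter; with these toy vectors a value above roughly 0.82 would leave only the exact match.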

4.5 Other types of distances

Other distance measures that were considered were the following:

• Euclidean distance is the straight-line distance between two vectors in Euclidean space. It is defined by the following formula, where x and y are the two vectors.

\[ d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^{2}} \]

Equation 5: Definition of Euclidean distance.

However, this distance is not needed here, as the magnitude of the vectors is not important, and its range of values can be much larger and less intuitive for a human reader.

• Manhattan distance is defined as the sum of the absolute differences between the components of two vectors.

\[ d(x, y) = \sum_{i=1}^{n} \lvert x_i - y_i \rvert \]

Equation 6: Definition of Manhattan distance.

Manhattan distance is preferred to Euclidean distance for high-dimensional data (Aggarwal et al. 2001). However, it is not as commonly used as the Euclidean or cosine distance for this specific task.


These distances could be used to compare word embeddings; however, when tested on the dataset with some example words, neither of them proved more efficient or offered any benefit over cosine similarity, and they were therefore discarded.
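For comparison, the two discarded measures can be written as follows. This is a plain Python sketch using the standard definitions of both distances; the function names are illustrative.

```python
import math


def euclidean(x, y) -> float:
    """Square root of the summed squared component differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y) -> float:
    """Sum of the absolute component differences."""
    return sum(abs(a - b) for a, b in zip(x, y))
```

Unlike cosine similarity, both values grow with the length of the vectors: `[1, 2]` and `[10, 20]` point in the same direction but are far apart under either distance, which is one reason the results are harder to read against a fixed threshold.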


5 Methodology

The necessary information was obtained through a documentary analysis of scientific literature and official documentation. The aim was to test the researched methods and evaluate their performance for this specific case.

A first method for the word search consisted of a system that matched search words with the same exact word in the database. However, that method was discarded as it was too limiting.

A second method, which searched for synonyms in a synonym dictionary, was proposed. However, it required loading a whole dictionary in which most of the words would not be used, and it was quite limiting in the number of matches.

For that reason, a system with a wider range of matches was sought. GloVe was chosen over other models. It is a good candidate because its word embeddings have been extensively tested and used in different types of fields and projects. Besides, it was easy to implement and ready to use with the Microsoft.ML library.

Regarding the state variables, a system with a few conditional statements was tested first. However, the system was changed so that the results were not limited to a crisp value but covered a wider range of possibilities. The membership values of fuzzy sets were used instead, to examine whether the results obtained were feasible and could be used in a finished product.

Finally, Ink was chosen over other dialogue systems due to its ease of use and versatility. Ink allows the writer to add not only dialogue text but also variables, logic statements and functions, which are compiled into the JSON.

Additionally, Ink can be edited and tested outside the game engine, which helps to avoid possible future bugs in the project.

Therefore, the research led to the methods finally used: Ink as a scripting language, an unsupervised NLP model with word embeddings and a system with fuzzy logic variables.


6 Discussion

The final aim was to assess the viability of word embeddings and semantic similarity methods, the fuzzy logic system, and the technologies for the dialogue system in a videogame with the desired characteristics. The findings were as follows.

Ink is a very suitable option for this specific type of dialogue. It is easy to use, versatile and light on the system. It is quick to learn, and it allows decision trees, variables, functions and text to be kept in the same place, which is really convenient for the required task.

The fuzzy logic system was hard to define, as there was no ground truth from which to set the rules. The rules are a free interpretation of different emotional states instead of universal truths. Therefore, there is no proof that they are the most efficient or the best suited for the task. However, the system does return valid numbers that are consistent with the expected results.

Almost all the libraries used have either MIT or Apache 2.0 licences. Therefore, they are open source and could be used commercially and distributed if needed.

The exception is the Accord.NET library, which has an LGPL licence and, even though it is open source, could present problems when used in a commercial product.

The word embedding works correctly and, because most of the words are embedded before runtime and serialized, it is not too computationally intensive for the system. The threshold for the cosine similarity was set at 0.4 after comparing a list of cosine similarity and Euclidean distance results for the most probable search words for the set of questions used.

The higher this number is, the less permissive the system is with word matches; if it is set too high, the system does not even accept words that are very close, such as synonyms.

Therefore, it was set to a rather low number. However, this led to some questions with little semantic similarity being accepted when this was not desired.
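The trade-off described above can be sketched as follows. This is a minimal illustration in Python, not the project's C# code: the function names, the use of `>=` for the comparison and the three-dimensional toy vectors are assumptions made for the example, and the vectors are invented rather than taken from the GloVe model. Only the threshold value of 0.4 comes from the text.

```python
import math
from typing import Dict, List, Sequence

THRESHOLD = 0.4  # value reported in the text


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def matching_keywords(
    query_vec: Sequence[float],
    keyword_vecs: Dict[str, Sequence[float]],
    threshold: float = THRESHOLD,
) -> List[str]:
    """Return the keywords whose embedding is at least `threshold` similar to the query."""
    return [
        word
        for word, vec in keyword_vecs.items()
        if cosine_similarity(query_vec, vec) >= threshold
    ]


# Invented toy embeddings, for illustration only.
keywords = {"knife": (1.0, 0.0, 0.0), "garden": (0.0, 1.0, 0.0)}
query = (0.8, 0.6, 0.0)

loose = matching_keywords(query, keywords)        # threshold 0.4
strict = matching_keywords(query, keywords, 0.7)  # stricter threshold
```

In this toy case the query is closer to `knife` than to `garden`, yet at 0.4 both keywords are accepted, while at 0.7 only `knife` remains. This is the same behaviour as the unwanted matches observed with the low threshold.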

In closing, the results demonstrated that the cosine similarity and word embedding components would need to be changed before this system could be used in production.

7 Conclusion

The initial goal of this thesis was to create a dialogue system with a word search that allowed the player to communicate with an NPC. A dialogue system was created in Unity using fuzzy logic, the GloVe algorithm and cosine similarity. Several findings and conclusions have been reached with this work.

The whole system is well structured, and the GloVe algorithm does work as intended. However, due to the generic nature of the pretrained model, the embeddings turned out to be too generic to allow a meaningful comparison between words. Further research should be carried out on more suitable word embedding algorithms that are better tailored to the problem.

A custom-made model should be considered as it would be more specific to this database. If that is not possible, some further fine-tuning of the model should at least be tried. Moreover, a BERT algorithm should also be considered as it allows for more complex word and sentence comparisons. For that, a reliable way of integrating these other methods into the game engine should be found.

The current model takes a few seconds on the first load of the game due to the loading of the context class in the Microsoft.ML library, which is essential to create components for the model. Even though the embeddings are implemented relatively quickly, other more cost-efficient models should be considered, especially if the system needs to be integrated into a larger game.

The model had a high level of uncertainty when accepting or discarding words using cosine similarity between word embeddings. This is because the model only considers similarity and not relatedness, which leaves a large random factor in which embeddings are considered a match and which are not. A different method should be considered in the future to check word similarity.

A few future updates could include more variables for each question, for example, keeping track of which objects the player has interacted with and only showing questions about objects that have been found. Also, using the list of visited questions, some questions could be shown only after others have been visited.

Those questions could contain more specific information and could even create different endings for the dialogue.
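One possible shape for this future update is sketched below. It is a hypothetical Python illustration rather than part of the implemented system: the `Question` class, its field names and the example identifiers are all invented for the example, and in the actual project the same conditions could equally be expressed with Ink variables.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class Question:
    """A question plus the (hypothetical) conditions that unlock it."""
    id: str
    required_objects: FrozenSet[str] = frozenset()
    required_questions: FrozenSet[str] = frozenset()


def available_questions(
    questions: Iterable[Question],
    found_objects: Set[str],
    visited_questions: Set[str],
) -> List[str]:
    """Return the ids of questions whose object and question requirements are all met."""
    return [
        q.id
        for q in questions
        if q.required_objects <= found_objects
        and q.required_questions <= visited_questions
    ]


# Invented example data.
QUESTIONS = [
    Question("who_are_you"),
    Question("about_key", required_objects=frozenset({"key"})),
    Question(
        "whose_key",
        required_objects=frozenset({"key"}),
        required_questions=frozenset({"about_key"}),
    ),
]
```

Under these assumptions, `whose_key` only becomes available once the player has found the key and has already asked `about_key`, which is the kind of dependency that could lead to different endings.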

Additionally, more linguistic variables may be added to the fuzzy logic system, for example, variables that indicate the sensitivity of a question, or the range of the current linguistic variables may be enlarged.

Therefore, it is suggested that the dialogue graph and the fuzzy logic variables be used for production but not the word embeddings and word search because, as previously mentioned, they are not sufficiently precise for the task.

References

Aggarwal, C.C., Hinneburg, A. and Keim, D.A., 2001. On the surprising behavior of distance metrics in high dimensional space. In International conference on database theory (pp. 420-434). Springer, Berlin, Heidelberg.

Cintula, P., Fermüller, C. and Noguera, C., 2021. Fuzzy Logic (The Stanford Encyclopedia of Philosophy). Plato.stanford.edu. [Referred on 26 March 2022] (online). Available at: https://plato.stanford.edu/entries/logic-fuzzy/

Georgeson, J. and Child, C., 2016. NPCs as People, Too: The Extreme AI Personality Engine

GitHub. 2021. GitHub - Unity-Technologies/ml-agents: Unity Machine Learning Agents Toolkit. [Referred on 2 February 2022] (online). Available at: https://github.com/Unity-Technologies/ml-agents

Han, J., Kamber, M. and Pei, J., 2012. Data mining. 3rd ed. Amsterdam: Elsevier/Morgan Kaufmann.

Osathanunkul, C., 2015. A classification of business models in video game industry. International Journal of Management Cases, 17(1), pp.35-44.

Pennington, J., Socher, R. and Manning, C.D., 2014. Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1532-1543).

Zadeh, L., 1973. Outline of a New Approach to the Analysis of Complex Systems and Decision Processes. IEEE Transactions on Systems, Man, and Cybernetics, SMC-3(1), pp.28-44.
