The methods for conducting this study were largely inspired by Markus Korpi-Hallila’s 2005 study on how people end their conversations on Internet Relay Chat. In his study, Korpi-Hallila (2005) used a program called mIRC to create logs of the chats taking place on Internet Relay Chat, which he afterwards analysed for material on the ways of ending a chatroom conversation. His study inspired me to create my own logs in order to gain enough material to analyse for the purposes of this study. However, because mIRC does not seem to be used as often as it was roughly ten years ago, I had to use alternative methods and chatrooms to create my chat logs.

My data gathering was carried out for four weeks in total, in two two-week sessions during two different seasons. Data 1 was gathered from December 28th, 2015 to January 10th, 2016 and Data 2 from June 27th, 2016 to July 10th, 2016, every night for one hour. I alternated the starting time by one hour from night to night, beginning at either 08:00 or 09:00 PM (GMT+2). The first night I gathered data between 09:00 and 10:00 PM and the following night between 08:00 and 09:00 PM, then returned to 09:00 to 10:00 PM, and continued in this way from there on. This was done to try to avoid skewed data, in case certain people arrived in the chatroom at a certain time every night and always used the same type of language, which could potentially have influenced the data one way or another. The data was gathered by highlighting and copying the text in the chatroom and then pasting it into a Word document, thus creating a chat log.

After gathering the data, I searched for the abbreviations in the logs both manually and with the help of the program AntConc. It would have been difficult to think of every abbreviation in advance and to find each of them with a search tool alone, so some manual processing of the data was necessary. This was the case especially for the Apostrophe-frees category, where it would have been difficult to anticipate every occasion on which someone did not use an apostrophe, for example when it was missing from the genitive of someone’s name. However, for some very common abbreviations such as lol and u, I used AntConc to help me count them, as I knew there would be many hits for them. In the manual processing of the chat logs, I first emboldened all the abbreviations I could find. Afterwards, I placed them in their correct categories, marking them as tallies in groups of five lines at a time so that the numbers would be easier for me to count. The data was analysed by looking at how the abbreviations were distributed across the categories in percentage terms, and the two sets of data were also compared with one another.
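The kind of count that a concordance tool provides for very common forms can be illustrated with a short sketch. The Python snippet below is not the procedure used in this study, which relied on AntConc and manual tallying; it is only a minimal illustration, assuming a plain-text chat log. The helper name `count_forms` and the sample log are invented for the example.

```python
import re
from collections import Counter


def count_forms(log_text, forms):
    """Count whole-word, case-insensitive hits for each form in a chat log.

    Hypothetical helper for illustration; it does not reproduce AntConc.
    """
    # Lowercase the log and split it into simple word tokens.
    tokens = re.findall(r"[a-z0-9']+", log_text.lower())
    counts = Counter(tokens)
    # Counter returns 0 for forms that never occur.
    return {form: counts[form] for form in forms}


# Invented sample log, not taken from the study's data.
sample_log = "lol u there\nyea lol\nU ok? ur late"
print(count_forms(sample_log, ["lol", "u", "ur"]))
```

A whole-word match of this kind only finds the exact forms it is given: a stretched spelling such as lollll or a missing apostrophe would not be found, which is why manual processing remained necessary.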

Because one of the aims of this study was to see whether any changes had taken place between the two sets of data, I also needed to find out whether the changes observed were statistically significant. I tested the significance of the changes with the log likelihood calculation.

I specifically looked into whether there were significant changes in how frequently both the categories and some of their most common abbreviations occurred per turn. Log likelihood can be calculated with an online calculator (http://ucrel.lancs.ac.uk/llwizard.html) that has four input fields, two for each set of data. The total sizes of the two corpora are entered into one pair of fields, and the frequencies of the item to be examined into the pair of fields above them. Log likelihood works well for comparing two sets of data with one another, even when the corpora are of different sizes.
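The calculation behind the four fields can be sketched in a few lines of Python. This is a minimal sketch of the standard two-corpus log likelihood formula and is not the code of the website itself. The function name `log_likelihood` and the figures in the usage example are invented for illustration and are not results of this study.

```python
import math


def log_likelihood(freq1, freq2, total1, total2):
    """Two-corpus log likelihood for one item.

    freq1, freq2   -- observed frequencies of the item in each set of data
    total1, total2 -- total sizes of the two sets (for example, turns)
    """
    combined = freq1 + freq2
    # Expected frequencies if the item were equally common in both sets.
    expected1 = total1 * combined / (total1 + total2)
    expected2 = total2 * combined / (total1 + total2)
    ll = 0.0
    for observed, expected in ((freq1, expected1), (freq2, expected2)):
        # A zero observed frequency contributes nothing to the sum.
        if observed > 0:
            ll += observed * math.log(observed / expected)
    return 2 * ll


# Invented figures for illustration only.
print(log_likelihood(120, 95, 1500, 1400))
```

Because the expected frequencies are scaled by the size of each corpus, the value is zero when the item is equally frequent in relative terms in both sets, whatever their sizes.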

Figure 1: Log likelihood calculator chart

The website then calculates the log likelihood value, by which the difference between the two sets of data can be judged. The minimum value indicating a statistically significant difference is 3.84, which corresponds to the 5% level (p < 0.05): there is less than a 5% chance that a difference of this size would arise by chance alone. The difference can also be significant to a higher degree than this. A value of 6.63 corresponds to the 1% level (p < 0.01), 10.83 to the 0.1% level (p < 0.001) and 15.13 to the 0.01% level (p < 0.0001) (Rayson et al. 2004: 7).
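The reading of these critical values can be summarised as a small lookup. The sketch below is only an illustration: the thresholds are the ones cited above from Rayson et al. (2004: 7), while the function name `significance_level` and the choice to treat a value exactly equal to a threshold as reaching it are assumptions made for the example.

```python
# Critical values as cited from Rayson et al. (2004: 7), highest first.
CRITICAL_VALUES = [
    (15.13, "p < 0.0001"),
    (10.83, "p < 0.001"),
    (6.63, "p < 0.01"),
    (3.84, "p < 0.05"),
]


def significance_level(ll_value):
    """Return the strictest significance level a log likelihood value reaches.

    Hypothetical helper; treating a value equal to a threshold as reaching
    that level is an assumption of this sketch.
    """
    for threshold, label in CRITICAL_VALUES:
        if ll_value >= threshold:
            return label
    return "not significant"


print(significance_level(7.2))
```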

The chart created by the calculations also gives additional information about the corpora, such as which set of data has more occurrences in relative terms when compared with its total size. Log likelihood was thus used to give more precise and additional information about the numbers I discovered during my research. It helped to establish whether any significant changes had taken place between the two sets of data, and what types of changes they were.

Some of the larger differences between how this study and my BA thesis were conducted are that I was consistent with the times of day of gathering data and that I gathered data for longer, both per session (one hour instead of 45 minutes) and overall (two weeks instead of one). The limits on what would be accepted as an abbreviation for the purposes of this study were also developed further. I was uncertain whether it would be useful or appropriate to count even words such as “TV” or “USA” as abbreviations when they have become generally unnoticed parts of the English language. Mattiello (2013: 84) also noted this, stating that in the modern day certain words that originated as Initialisms no longer retain their capital-letter spelling and, for most speakers, have lost their connection with the words they were originally abbreviated from. In order to refine which words to take into the data, the online version of the Oxford English Dictionary (OED) was utilised. At first I had been planning to limit the abbreviations I would count by the time when they were first used, thinking it would give me a better idea of which words had already become regularised parts of English, but it turned out that many of the abbreviations were much older than I had originally thought. Therefore, this plan no longer seemed like a good option, as it would not have given me much data to work with. Instead, after looking over some of the abbreviations in the OED which were almost certainly going to be included in the study’s data, I decided to use colloquialism as my method of limitation. This means that if the OED stated that the abbreviation in question was colloquial language, slang or something similar, it was counted as a part of the data. If not, it was left out of the data. Abbreviations that were not in the OED were assumed to be new or unknown to such a degree that they had not yet been remarked upon enough to be made a part of official language, and they were thus also included in the data.

An additional thing to be noted about the handling of data in this study is that contractions such as “don’t” and “won’t” were not counted as abbreviations. For the purposes of this study, they were considered to be too commonplace to really count, although Bieswanger (2007) evidently counted them in his study. Most contractions of this type would also have ended up in the Apostrophe-frees category as they were missing the apostrophe.

The chatroom the data was gathered from was also changed. For my BA thesis the data was gathered from ICQ chat, but I had known for a long time that I would have to gather it from another chatroom if I was going to do a larger-scale study on the same topic. With ICQ, one of the biggest problems I had with gathering data was that if someone added another line to the chat while the text was highlighted for copying, the highlighting would disappear, and I would have to highlight the text again if I had not yet managed to copy it. This was quite troublesome even for a small study, so I knew I would have to use a chatroom with no such issue if I wanted to continue with the same area of study. Therefore, I looked into various international chatroom sites, and in the end chose E-Chat (http://e-chat.co/) as the source for my data. It is a free chatroom site with several different chatrooms, and although one does need a nickname to join the chat, one does not need to be registered on the site to do so. However, registered users get to have profile pictures which show up every time they say something in the chatrooms, and they can also tell something about themselves on their profiles. The data was only gathered from one of the chatrooms, called Just Chat, because it was the first option offered to people on the site and it seemed as though it would mostly entail general conversation rather than focus on specific topics or draw in people belonging to a specific group (such as teenagers or lesbians). It also seemed to be the chatroom that would have the most traffic. However, the activity in the room varied quite a bit from night to night, for no apparent reason. Regardless, there were always at least some active people there, many of them evidently regulars who were there almost every night. This may have influenced or distorted the data, as often it was mostly the same approximately ten people talking amongst each other.

Although there were other chatters present as well, these regulars and veterans were the most vocal in the chatrooms and contributed the most to the conversation.

E-Chat is an international chatroom open to everyone, and is mostly English-speaking. However, sometimes certain chatters would talk a bit to themselves in what I assume was their native tongue, or were asked to demonstrate their native tongue to the other chatters:

(2) Sieg_ (21:16) todo bajo control y alla?
kittygal (21:16) heu Nessa :)
Neighbourhood Alien (21:16) Hi to you too
Mimi79 (21:16) heu? Das ist nicht deutsch :D
Binotrash110001111 (21:16) lollll your using different language lolll
doctor rollins 124 (21:16) hehe
kittygal (21:17) no mucho sieg
kittygal (21:17) nessa no haha :P
Sieg_ (21:17) bueno eso de vez en cuando esta bien
(January 5th, 2016)

Conversation, however, generally took place in English because it was usually the only common language everyone in the chatroom could understand and communicate with.

The participants were not limited in any way for this study, whether in number or with regard to their level of fluency in English or their place of origin. Limiting them in such ways would have required more advanced research methods. I also did not deem it necessary to inform the chatters that they were being recorded, as highlighting and copying the text were options available to anyone who wished to use them. It would also have been quite difficult to inform everyone coming and going from the chatroom that they were being recorded, and it might have influenced the results of the study. I have also changed the nicknames of the chatters in the examples presented in this paper to protect their identities, and will not display any of their profile pictures provided on E-Chat. I did not take part in the conversation myself. Instead, I merely took the role of an outside observer who saved the chats into data logs. A few of the chatters attempted to engage me in a private chat conversation almost every night, but I never accepted any such requests. I also changed the nickname with which I logged onto the chatroom every two nights.

There were some cases in handling the data where it was slightly more difficult to determine in which category an abbreviation would best be placed. For instance, with the Letter/number homophones, the most common abbreviation, dominating the data, was u for “you”. It was used considerably more than the full word “you”, and as “you” is a fairly versatile and commonly used word, this made Letter/number homophones the second most common category of abbreviations in chatting. However, the u abbreviation was also used in the other forms of the word “you”, such as the genitive “your” (as in ur) and “you’re” (as in u’re or ure, without the apostrophe). In such cases, I was not sure whether they should belong to the Clippings category or to the Letter/number homophones like their base form. I decided to go with the Letter/number homophones, as that is where the base form belongs. However, if the apostrophe was missing in these cases, I included the abbreviation in both Letter/number homophones and Apostrophe-frees.

This type of difficulty with placing certain abbreviations also occurred with the Phonetic spellings category, which would have had fewer occurrences if I had decided to place certain words in it in other categories. The abbreviations belonging to Phonetic spellings that occurred the most were different ways of expressing agreement, most commonly yea and ya, the latter of which could also stand for “you”. It was difficult to decide whether yea should be in this category or in Clippings, as it is missing the final letter h, but I decided to put both yea and ya into Phonetic spellings, as I felt that the more likely reason for someone to leave out only one or two letters would be to mimic how the word might sound in real life rather than a lack of effort. With the current categorisations, however, ya did not actually seem to belong in any category, but I decided to include it in Phonetic spellings along with yea due to their similar usage and similar forms.