

6.2.3 Evaluation of the recommender system

The evaluation of recommender systems has been an area of active research. Evaluation informs the selection of recommender algorithms, how they are tuned and how the system is designed, but most importantly it enforces discipline about recommendations and the user experience. Most research on recommender systems focuses on evaluation using accuracy metrics (e.g. Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Precision, Recall and Receiver Operating Characteristics (ROC)), but (Mcnee et al., 2006), (Ziegler, McNee, Konstan, & Lausen, 2005) and (Pu, Chen, & Hu, 2011) argue that such a narrow focus is misguided and has even been detrimental to the field. Moreover, accuracy metrics do not judge the content of the recommended items; they only look at the accuracy of individual predictions.
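For reference, the following minimal sketch (illustrative only, not taken from the PCM code base) shows how two of these accuracy metrics, MAE and RMSE, are computed over a list of rating predictions; each prediction is scored against its actual rating in isolation, which is exactly why such metrics say nothing about the composition of a recommendation list:

// Illustrative sketch only (not from PCM): MAE and RMSE score each predicted rating
// against the corresponding actual rating in isolation.
function mae(actual, predicted) {
  let sum = 0;
  for (let i = 0; i < actual.length; i++) {
    sum += Math.abs(actual[i] - predicted[i]);
  }
  return sum / actual.length;
}

function rmse(actual, predicted) {
  let sum = 0;
  for (let i = 0; i < actual.length; i++) {
    sum += Math.pow(actual[i] - predicted[i], 2);
  }
  return Math.sqrt(sum / actual.length);
}

console.log(mae([5, 3, 4], [4.5, 2.0, 4.0]));  // 0.5
console.log(rmse([5, 3, 4], [4.5, 2.0, 4.0])); // ≈ 0.645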

In their paper, (Mcnee et al., 2006) review three other aspects that should be considered when evaluating recommender systems, one of which I used as a metric for the evaluation of this system. For (Mcnee et al., 2006), the similarity of recommendation lists, recommendation serendipity, and the importance of user expectations in a recommender are paramount if recommendation is to be done from a user-centric perspective. In a similar study, (Ge, Delgado-Battenfeld, & Jannach, 2010) agree that “accurate recommendations may sometimes not be the most useful ones to the users, and that evaluation metrics should take into account other factors which impact recommendation quality such as serendipity and be applied to recommendation lists and not on individual items”. They go further to focus on metrics which take ‘coverage’ and ‘serendipity’ into consideration. Serendipity metrics measure the degree of “unexpectedness” of recommendations made.

In the end, users are the main reason why recommender systems exist, and although user satisfaction can be evaluated using different criteria, I believe that it depends on whether or not a recommender system provides unexpected items that are relevant to users’ needs. Therefore, to consider user satisfaction, evaluation metrics must also include serendipity and novelty. Since the PCM system aims to provide novel and serendipitous items, I focus on serendipity and consider a metric for measuring it.

I used the same approach as (Murakami, Mori, & Orihara, 2007), which is based on the idea that “unexpectedness is low for easy-to-predict items and high for difficult-to-predict items”, and the metric proposed by (Ge et al., 2010), which “determines the ratio of serendipitous recommendations in the recommendation list and also takes their usefulness into account”. I create two lists: (1) items predicted by a primitive prediction model (PPM), which returns items with high ratability, i.e. items that have been rated highly by users, denoted by PM, and (2) items returned by our prediction model, which returns relevant and unexpected items, denoted by RS. I assume that when an item in RS cannot be found in PM it is considered unexpected and relevant. I calculate the unexpected set as follows:

𝑈𝑁𝐸𝑋𝑃 = 𝑅𝑆 − 𝑃𝑀

𝑈𝑁𝐸𝑋𝑃 represents a set containing only unexpected items, i.e. if RS and PM are two sets of items, 𝑅𝑆 − 𝑃𝑀 (UNEXP) represents the items in RS that are not found in PM. But unexpectedness alone does not mean relevance. If each item in 𝑅𝑆 − 𝑃𝑀 is represented as 𝑅𝑆ᵢ, the relevance of that item is 𝑟ᵢ: the aggregated psv of the item (see section 6.2.1), i.e. the sum of each user’s psv for the item. The serendipity value of PCM’s recommended set of items for a user is:

𝑆𝑅𝐷𝑃 = (1/𝑁) ∑ᵢ 𝑟ᵢ , where the sum runs over all items in the unexpected list and 𝑁 is the total number of items in that list.
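A minimal sketch of this computation is given below (the function and variable names are my own and not taken from the PCM source code): UNEXP is obtained by removing from RS every item that also appears in PM, and SRDP is the mean aggregated psv of the remaining items.

// Illustrative sketch only: UNEXP = RS − PM, SRDP = mean relevance of the unexpected items.
// `relevance` maps an item id to its aggregated psv (sum of every user's psv for that item).
function serendipity(rsItems, pmItems, relevance) {
  const pmSet = new Set(pmItems);
  const unexpected = rsItems.filter(item => !pmSet.has(item)); // UNEXP = RS − PM
  if (unexpected.length === 0) return 0;
  const total = unexpected.reduce((sum, item) => sum + (relevance[item] || 0), 0);
  return total / unexpected.length; // SRDP = (1/N) Σ rᵢ
}

// Example: two of the hybrid recommendations do not appear in the primitive list.
const srdp = serendipity(
  ['a', 'b', 'c', 'd'],  // RS: items from the hybrid model
  ['a', 'c'],            // PM: items from the primitive (high-ratability) model
  { b: 0.8, d: 0.4 }     // aggregated psv of the unexpected items
);
console.log(srdp); // 0.6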


While evaluating, I had only a few users and over 400 news items in the database. I compared the approach used in this study to Amazon’s item-to-item collaborative filtering. In the item-based approach, based on a user’s rating actions, items which are similar in content (using cosine similarity as in section 4.2) to those the user has previously rated highly are recommended. I selected 0.6 as the threshold when calculating the similarity between news items. To validate my approach using the above metric, I calculated the serendipity value at any point in time, where 𝑃𝑀 represents recommendations generated by Amazon’s item-to-item approach and 𝑅𝑆 represents recommendations generated by our hybrid approach. For some users, the 𝑈𝑁𝐸𝑋𝑃 value was over 30%, since 𝑅𝑆 contained items not present in 𝑃𝑀. As noted earlier, this does not mean that those items are relevant to a user. Therefore, for each news story 𝑖 in the unexpected recommendations, 𝑟ᵢ represents the relevance of the item to the user. The results therefore take into consideration both aspects dealt with in this study (serendipity and relevance).
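For illustration, the following sketch shows the similarity test used for the item-based baseline, under my own assumption that each news item is represented as an object mapping terms (or taxonomy labels) to weights; two items count as similar when their cosine similarity exceeds the 0.6 threshold:

// Cosine similarity between two content vectors represented as {term: weight} objects.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (const term in a) {
    normA += a[term] * a[term];
    if (term in b) dot += a[term] * b[term];
  }
  for (const term in b) normB += b[term] * b[term];
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const THRESHOLD = 0.6; // threshold used in this study
const storyA = { politics: 0.9, economy: 0.4 };
const storyB = { politics: 0.7, sports: 0.2 };
console.log(cosineSimilarity(storyA, storyB) > THRESHOLD); // ≈ 0.88 > 0.6, so the stories count as similar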


Conclusion

The web has evolved into the most powerful news delivery platform. But with the ever-growing volume of content, readers expect to be provided only with stories tailored to their interests. Over the years, news aggregators have collated news from various providers and presented it to readers. While different personalization techniques have been employed to provide users with the best news, this often comes at the expense of trading novelty and relevance for simple personalization. That is, while most aggregating systems may provide news stories which match readers' interests, many do not provide content which is relevant and serendipitous. One reason for this is that the recommendation techniques used usually focus on data from users' previous activities, such as rating, searching for and viewing items.

At the beginning of this study, I indicated that one objective was to solve the ‘filter bubble’ problem, a problem resulting from what I called ‘over-personalization’ by modern personalization approaches. I do so by using a combination of two recommendation techniques (content-based and collaborative filtering) without the use of a user’s previous activities, as in traditional approaches. Instead, similarity is based on the closeness of profiles, enhancing relevance, and the final recommendation is based on neighbors’ previous ratings in addition to information about their profession. By combining the two approaches, I mean that I have created a platform where both methods are employed differently in making recommendations for users: a main section which utilizes content-based recommendations, and a sidebar which recommends items based on neighbors’ interests. Both approaches utilize IBM Natural Language Understanding for better personalization by semantically extracting information from news stories which is useful for the system’s recommendation process. Although this study utilizes Watson only for getting taxonomies from news content, there are other areas in which it could be used alongside optimal predictive analysis to enhance personalization, such as giving readers historical information about concepts in a news story as they read, or providing readers with information about the emotions expressed by the news author, or even the sentiments expressed (whether positive or negative). There were a few times in this study where Watson’s analysis of a news story didn’t produce the correct category for that story. In such cases, the accuracy of Watson Natural Language Understanding can be improved by customizing text analysis with custom models using Watson Knowledge Studio.
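For reference, the sketch below shows how taxonomy categories can be requested from the Natural Language Understanding service with the watson-developer-cloud Node.js SDK; the credentials are placeholders and the option names follow the 2017 service documentation rather than the PCM source code:

// Sketch only: fetch taxonomy categories (and a few concepts) for a news story.
var NaturalLanguageUnderstandingV1 = require('watson-developer-cloud/natural-language-understanding/v1.js');

var nlu = new NaturalLanguageUnderstandingV1({
  username: 'YOUR_USERNAME',   // placeholder credentials
  password: 'YOUR_PASSWORD',
  version_date: '2017-02-27'
});

nlu.analyze({
  url: 'https://example.com/some-news-story', // a news story URL (or pass `text` instead)
  features: {
    categories: {},               // taxonomy labels of the kind PCM uses
    concepts: { limit: 5 }        // an extra feature Watson can return alongside categories
  }
}, function (err, response) {
  if (err) return console.error(err);
  console.log(response.categories); // e.g. [{ label: '/news/politics', score: 0.93 }, ...]
});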

While the approach used in this study solves the problem of data sparsity by using users’ profiles, it doesn’t yet address the issue of scalability which may arise with a large number of users. That is, when the volume of the data set is huge, the computational load would be very high, since nearest-neighbor algorithms require computation that grows with both the number of users and the number of items. Some solutions to this problem have been proposed by researchers. For example, (Zhi-Dan Zhao & Ming-Sheng Shang, 2010) suggested the use of cloud computing platforms such as Hadoop (which implements the MapReduce framework), Microsoft’s Dryad, or Amazon’s Dynamo. According to them, by using a platform such as Hadoop, the program can execute in parallel while using MapReduce to break big problems into smaller ones, thus improving speed. (Siavash & Ali, 2011) also proposed a “method which features hybridizing density-based clustering and user-based collaborative filtering”. In their approach, after preprocessing (getting similarity measures based on demographic attributes such as the age and gender of users), users are clustered based on these attributes and each cluster is used as an input to user-based collaborative filtering (a rough sketch of this idea is given after this paragraph). (Lee, Choi, & Woo, 2002) proposed a system which combines collaborative filtering (CF) with a Self-Organizing Map (SOM) neural network. Therefore, in future work, I intend to incorporate a neural network into PCM to enhance recommendations, which would allow diverse kinds of information to be integrated, users’ demographic information (age, sex, etc.) to be applied, various data types to be considered and information inefficiency to be handled properly. This would mean a more complex system that meets users’ needs even more accurately.

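The sketch below is my own simplification of that clustering-then-filtering idea (not the algorithm from the cited paper): users are first grouped by demographic attributes, and the neighborhood search of user-based collaborative filtering is then restricted to the active user’s group, so the cost of finding neighbors grows with the group size rather than with the whole user base.

// Group users by a crude demographic key (gender plus age decade); a real
// implementation would use a proper density-based clustering algorithm instead.
function groupUsers(users) {
  const groups = new Map();
  for (const user of users) {
    const key = user.gender + ':' + Math.floor(user.age / 10);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(user);
  }
  return groups;
}

// Find the top-k most similar users, searching only inside the active user's group.
// `similarity` is any user-to-user similarity function (e.g. over rating vectors).
function neighborsFor(activeUser, groups, similarity, k) {
  const key = activeUser.gender + ':' + Math.floor(activeUser.age / 10);
  const candidates = (groups.get(key) || []).filter(u => u.id !== activeUser.id);
  return candidates
    .map(u => ({ user: u, score: similarity(activeUser, u) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k); // these neighbors feed the usual user-based CF prediction step
}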

Overall, this study addresses the questions posed at the beginning. In chapter 5, I elaborated in detail the various recommendation processes and the reasons behind my choice of algorithm. In chapter 2, I discussed why relevant and serendipitous content is necessary. By using a hybrid system of user-based collaborative filtering and content-based filtering, users’ novelty needs are met, recommendation is improved by using a demographic attribute of users (profession), and the relevance of recommended content is optimal.

In conclusion, the news industry is undergoing a wide-ranging transformation amid the hype around social media and instant online video recording of events. This means that anyone can create their own news stories and publish them on the internet. In such a scenario, personalization and relevance are important factors to consider if news content is to be delivered right. Moreover, in the future, I think the news companies that leverage the capabilities of the semantic web for personalizing content, by linking all entities representing places, things or people for machine readability, will attract the largest audience.


Acknowledgement

Foremost, my sincere gratitude goes to my supervisors Dr. Oleksiy Khriyenko & Vagan Terziyan for their patience, motivation, support and immense knowledge. Their guidance helped me throughout the writing of this thesis. I couldn’t have imagined better advisers and mentors for this study.

Besides my supervisors, I would like to show my appreciation to the Faculty of Information Technology, where I studied the Web Intelligence and Service Engineering program. The study structure is designed in a way that builds the knowledge needed to develop intelligent software applications for web services. I have no regrets about choosing this program.

My sincere thanks also go to Truth Lumor for the thought-provoking discussions we had about this subject area. I also thank my teacher Fergal Carolan for helping me proofread and better structure this thesis.

Lastly, I thank my parents Henry Okojie and Blessing Okojie for their sponsorship and immense support both in my studies and career.

Charles Osemudiame Okojie
Jyväskylä, February 22, 2018


Bibliography

Adam, P. (2016). Extracting Content From the Chaos of the Web: Introducing the Mercury Web Parser. Retrieved August 16, 2017, from https://trackchanges.postlight.com/extracting-content-from-the-chaos-of-the-web-introducing-the-mercury-web-parser-e920a1db7f86

Aggarwal, C. C. (2016). Neighborhood-Based Collaborative Filtering. In Recommender Systems (pp. 29–70). Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-29659-3_2

Balabanović, M., & Shoham, Y. (1997). Content-Based, Collaborative Recommendation. Communications of the ACM, 40(3), 66–72.

Bogart, L. (2004). Reflections on content quality in newspapers. Newspaper Research Journal, 25(1), 40–53. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&db=ufh&AN=12728024&site=ehost-live

Bonhard, P., & Sasse, M. A. (2006). ’Knowing me, knowing you’ — Using profiles and social networking to improve recommender systems. BT Technology Journal, 24(3), 84–98. https://doi.org/10.1007/s10550-006-0080-3

Borlund, P. (2003). The concept of relevance in information retrieval. Journal of the American Society for Information Science and Technology, 54(10), 913–925. https://doi.org/10.1002/asi.10286

Chowdhury, S., & Landoni, M. (2006). News aggregator services: user expectations and experience. Online Information Review, 30(2), 100–115. https://doi.org/10.1108/14684520610659157

Crockford, D. (2006). The application/json Media Type for JavaScript Object Notation (JSON). Network Working Group, 1–10. Retrieved from https://tools.ietf.org/pdf/rfc4627.pdf

Daniel, G.-P., Lourenco, A., Lopez-Fernandez, H., Reboiro-Jato, M., & Fdez-Riverola, F. (2013). Web scraping technologies in an API world. Briefings in Bioinformatics, 15(5), 788–797. https://doi.org/10.1093/bib/bbt026

Fan, X., Mostafa, J., Mane, K., & Sugimoto, C. (2012). Personalization is not a panacea: Balancing serendipity and personalization in medical news content delivery. IHI’12 - Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium, 709–713. https://doi.org/10.1145/2110363.2110445

Ge, M., Delgado-Battenfeld, C., & Jannach, D. (2010). Beyond accuracy. In Proceedings of the fourth ACM conference on Recommender systems - RecSys ’10 (p. 257). New York, New York, USA: ACM Press. https://doi.org/10.1145/1864708.1864761

Gill, K. E. (2005). Blogging, RSS and the Information Landscape: A Look At Online News. University of Washington (WWW 2005 workshop on the weblogging ecosystem), 7. Retrieved from http://www.ramb.ethz.ch/CDstore/www2005-ws/workshop/wf10/gill.pdf

Giri Kumar, T., & Donald, P. B. (1998). Examining data quality. Communications of the ACM, 41(2), 54–57. Retrieved from http://delivery.acm.org/10.1145/270000/269021/p54-tayi.pdf


Glez-Peña, D., Lourenço, A., López-Fernández, H., Reboiro-Jato, M., & Fdez-Riverola, F. (2014). Web scraping technologies in an API world. Briefings in Bioinformatics, 15(5), 788–797. https://doi.org/10.1093/bib/bbt026

Greg, L., Brent, S., & Jeremy, Y. (2003). Amazon.com Recommendations: Item-to-Item Collaborative Filtering. IEEE Internet Computing. Retrieved from https://www.cs.umd.edu/~samir/498/Amazon-Recommendations.pdf

Haddaway, N. R. (2016). The use of web-scraping software in searching for grey literature. Grey Journal, 11(February), 186–190. Retrieved from https://www.researchgate.net/profile/Neal_Haddaway/publication/282658358_The_Use_of_Web-scraping_Software_in_Searching_for_Grey_Literature/links/5616886708ae0f21400724e1.pdf

High, R. (2012). The Era of Cognitive Systems: An Inside Look at IBM Watson and How it Works. International Business Machines Corporation, 1(1), 1–14. Retrieved from http://www.redbooks.ibm.com/redpapers/pdfs/redp4955.pdf

Hsu, D., & Shigetoshi, L. (2011). The Filter Bubble. Retrieved from http://web.cs.ucdavis.edu/~rogaway/classes/188/fall11/p204.pdf

Hu, Y., Koren, Y., & Volinsky, C. (2008). Collaborative Filtering for Implicit Feedback Datasets. IEEE International Conference on Data Mining, 263–272. https://doi.org/10.1109/ICDM.2008.22

IBM. (2017a). Natural Language Understanding | Overview | IBM Watson Developer Cloud. Retrieved April 14, 2017, from https://www.ibm.com/watson/developercloud/doc/natural-language-understanding/

IBM. (2017b). Watson Discovery News. Retrieved November 18, 2017, from https://console.bluemix.net/docs/services/discovery/watson-discovery-news.html#watson-discovery-news

Jain, N., Mangal, P., & Mehta, D. (2014). AngularJS: A Modern MVC Framework in JavaScript. Journal of Global Research in Computer Science, 5(12), 17–23. Retrieved from http://www.jgrcs.info/index.php/jgrcs/article/view/952

Judith, W. (1983). RSS: the latest feed. [Pierian Press]. Retrieved from http://www.ucd.ie/wusteman/lht/wusteman-rss.html

Khriyenko, O. (n.d.). Lecture 1: IBM Watson cognitive computing services. Retrieved from http://users.jyu.fi/~olkhriye/itka352/lectures/Lecture-01.pdf

Khriyenko, O., Kaykova, O., Kovtun, D., Naumenko, A., Terziyan, V., & Zharko, A. (2005). General Adaption Framework: Enabling Interoperability for Industrial Web Resources. International Journal on Semantic Web and Information Systems, 1(3), 31–63. https://doi.org/10.4018/jswis.2005070102

Kim, K., & Meyer, P. (2005). Survey Yields Five Factors Of Newspaper Quality. Newspaper Research Journal, 26(1), 6–16.

Kim, M. W., Kim, E. J., & Ryu, J. W. (2004). A Collaborative Recommendation Based on Neural Networks (pp. 425–430). Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24571-1_39

Klein, B. (2002). When Do Users Detect Information Quality Problems on The World Wide Web? AMCIS 2002 Proceeding, 4.

Knight, S.-A., & Burn, J. (2005). Developing a framework for assessing information quality on the World Wide Web. Informing Science Journal, 8(3), 159–172. Retrieved from http://www.citeulike.org/user/proborc/article/2819789

Krijnen, D., Bot, R., & Lampropoulos, G. (2004). Automated Web Scraping APIs, 5. Retrieved from http://mediatechnology.leiden.edu/images/uploads/docs/wt2014_web_scraping.pdf


Kuflik, T., Dimitrova, V., Chin, D., Ricci, F., Dolog, P., & Houben, G.-J. (Eds.). (2012). User Modeling, Adaptation, and Personalization: 20th International Conference. Canada.

Kumar Nandanwar, A., & Pandey, G. S. (2012). Content Based Recommendation System Using SOM and Latent Dirichlet Allocation Model. International Journal of Computer Science and Information Technologies, 3. Retrieved from https://pdfs.semanticscholar.org/632c/ac44a186a3ab396eb41c0c5433e11b8c2a40.pdf

Lacy, S., & Fico, F. (1991). The link between newspaper content quality & circulation. Newspaper Research Journal, 12(August 1989), 46–57. https://doi.org/10.1177/073953299101200206

Lee, M., Choi, P., & Woo, Y. (2002). A Hybrid Recommender System Combining Collaborative Filtering with Neural Network. In Adaptive Hypermedia and Adaptive Web-Based Systems (pp. 531–534). Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47952-X_77

Li, W., Moore, A. W., & Canini, M. (2008). Classifying HTTP Traffic in the New Age (Poster). Proceedings of the 2008 ACM SIGCOMM, 479–480. Retrieved from http://www.reti.dist.unige.it/~marco/papers/http.p-sigcomm08.pdf

Lops, P., de Gemmis, M., & Semeraro, G. (2011). Content-based Recommender Systems: State of the Art and Trends. In Recommender Systems Handbook (pp. 73–105). Boston, MA: Springer US. https://doi.org/10.1007/978-0-387-85820-3_3

Lynn, P. (2010). Eli Pariser on the future of the Internet - Salon.com. Retrieved August 17, 2017, from http://www.salon.com/2010/10/08/lynn_parramore_eli_pariser/

MacCatrozzo, V. (2012). Burst the filter bubble: Using semantic web to enable serendipity. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7650 LNCS(PART 2), 391–398. https://doi.org/10.1007/978-3-642-35173-0-28

Mcnee, S. M., Riedl, J., & Konstan, J. A. (2006). Being Accurate is Not Enough: How Accuracy Metrics have hurt Recommender Systems. Retrieved from http://delivery.acm.org/10.1145/1130000/1125659/p1097-mcnee.pdf

Mozilla. (2017). API - Glossary | MDN. Retrieved August 15, 2017, from https://developer.mozilla.org/en-US/docs/Glossary/API

Murakami, T., Mori, K., & Orihara, R. (2007). Metrics for Evaluating the Serendipity of Recommendation Lists. In New Frontiers in Artificial Intelligence (pp. 40–46). Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-78197-4_5

Nanyang Technological University. (n.d.). Introduction to HTTP Basics. Retrieved October 26, 2017, from https://www.ntu.edu.sg/home/ehchua/programming/webprogramming/HTTP_Basics.html

Nielsen Global Survey. (2015). The Facts Of Life: Generational Views About How We Live. Retrieved August 27, 2017, from http://www.nielsen.com/us/en/insights/news/2015/the-facts-of-life-generational-views-about-how-we-live.html

Node.js Foundation. (2017). Simple server side cache for Express.js with Node.js. Retrieved August 16, 2017, from https://medium.com/the-node-js-collection/simple-server-side-cache-for-express-js-with-node-js-45ff296ca0f0

Parasie, S., & Dagiral, E. (2013). Data-driven journalism and the public good: “Computer-assisted-reporters” and “programmer-journalists” in Chicago. New Media & Society, 15(6), 853–871. https://doi.org/10.1177/1461444812463345

Pariser, E. (2011). The filter bubble: What the Internet is hiding from you. Penguin Press.

Peffers, K., Tuunanen, T., Rothenberger, M. A., & Chatterjee, S. (2007). A Design Science Research Methodology for Information Systems Research. Journal of Management Information Systems, 24(3), 45–77. https://doi.org/10.2753/MIS0742-1222240302

Phelan, O., McCarthy, K., & Smyth, B. (2009). Using Twitter to recommend real-time topical news. https://doi.org/10.1145/1639714.1639794

Pu, P., Chen, L., & Hu, R. (2011). A user-centric evaluation framework for recommender systems. In Proceedings of the fifth ACM conference on Recommender systems - RecSys ’11 (p. 157). New York, New York, USA: ACM Press. https://doi.org/10.1145/2043932.2043962

Rauch, G. (2012). Smashing Node.js: JavaScript everywhere. John Wiley & Sons Inc. Retrieved from https://books.google.fi/books?hl=en&lr=&id=G1y_5kpmatUC&oi=fnd&pg=PT21&dq=nodejs&ots=uSo7T6SAZ1&sig=Dec-xu4wfVNJncNk8RC9FEWbogw&redir_esc=y#v=onepage&q=nodejs&f=false

Rebekah Monson. (2014). Data Journalism Explication - Process - YouTube. Retrieved from https://www.youtube.com/watch?v=BkFPuyfjAdE

Rich Jaroslovsky: Part 1 - The Future of News : The Future of News. (n.d.). Retrieved March 14, 2017, from http://futureof.news/episodes/rich-jaroslovsky/

Sally, A. (2016). How can Facebook and its users burst the “filter bubble”? | New Scientist. Retrieved May 23, 2017, from https://www.newscientist.com/article/2113246-how-can-facebook-and-its-users-burst-the-filter-bubble/

Saracevic, T. (1996). Relevance reconsidered. Proceedings of the Second Conference on Conceptions of Library and Information Science (CoLIS 2). https://doi.org/10.1002/(SICI)1097-4571(199404)45:3<124::AID-ASI2>3.0.CO;2-8