Review of research about learning analytics in smart learning environments for programming education

Academic year: 2022

Santosh Deshmukh

Master's thesis

School of Computing Computer Science

August 2021


UNIVERSITY OF EASTERN FINLAND, Faculty of Science and Forestry, Joensuu School of Computing

Computer Science

Santosh Deshmukh: Educating Students about Learning Analytics in Smart Learning Environments

Master’s Thesis

Supervisors of the Master’s Thesis: Dr. Solomon Oyelere and MSc Agbo Friday Joseph

August 2021

Abstract

In the present era, Smart Learning Environments (SLEs) have become firmly integrated with education and work. The age of digital transformation requires more research to find the best possible ways of informal learning. Nowadays, learners have different smart services, tools, and other educational options through which they can learn. Since students need an innovative environment for a 21st-century learning experience, the development of smart learning environments has become necessary.

This study surveys the different kinds of data collected in smart learning environments for programming education. A literature review was conducted to investigate existing methods for improving student learning abilities in programming education through learning analytics deployed within smart learning environments. Findings from this research suggest that each integrated development environment (IDE) or smart learning environment behaves differently, depending on the features available in that environment.

The findings of this study include a taxonomy of the kinds of data that are collected from smart learning environments. The study also revealed that only a small amount of data is currently collected and investigated in IDE-based Learning Analytics. Therefore, it is recommended that future research focus more on the growth of IDE facilities in which a broader spectrum of data is gathered and evaluated.

Keywords: Learning Analytics, Integrated Development Environment, IDE & Sensors, Multimodal Data Collection and Smart Learning Environment.


List of abbreviations

SLE Smart Learning Environment

LA Learning Analytics

IDE Integrated Development Environment

PE Programming Environment

MDC Multimodal Data Collection

SL Smart Learning

RQ Research Question


Table of contents

Abstract

List of abbreviations

Table of contents

List of figures

List of tables

1 Introduction

1.1 Background

1.2 Theoretical framework

1.3 Research Objectives

1.4 Research Questions

1.5 Research Methodology

1.6 Structure of the Thesis

1.7 Summary of the chapter

2 Literature review

2.1 Programming education

2.2 IDE used in Programming Education

2.3 Smart Learning Environment

2.4 Learning Analytics

2.5 Smart knowledge management

2.6 Why Learning Analytics in SLE

2.7 Benefits of Learning Analytics

3 Methodology

3.1 Research context

3.2 Data Collection Procedure

3.3 Search Databases

3.4 Search Terms

3.5 Selection of articles

3.6 Data Analysis

4 Results

4.1 Types of data collected in Smart Learning Environment for Programming Education

4.2 The data from IDE and sensors which can be used to find innovative methods for improving student learning abilities in programming education

4.3 The methodologies in Learning Analytics that support knowledge extraction from data available in Smart Learning Environment

5 Discussion

5.1 RQ 1. What kind of data should be collected in the Smart Learning Environment for assisting programming education?

5.2 RQ 2. In which way can the data from IDE and sensors be worked for extracting information and to find innovative methods for improving student learning abilities in programming education?

5.3 RQ 3. What are the methodologies in Learning Analytics to extract knowledge from data available in smart learning environment?

6 Conclusion and Recommendations

7 Future work

References


List of figures

Figure 1. Blackbox Component Builder

Figure 2. BlueJ

Figure 3. ClockIt BlueJ Data Visualiser summary

Figure 4. ClockIt Web Interface Project Size Graph

Figure 5. Structural Overview of DevEvent Tracker [7]

Figure 6. Sample Project using Web CAT

Figure 7. Experiences with Marmoset [8]

Figure 8. Communication pathway in Hackystat [8]

Figure 9. Learning in smart environment

Figure 10. Healey and Jenkins Model 2009

Figure 11. Summary of the systematic review execution process

Figure 12. Taxonomy of design methodology in Learning Analytics


List of tables

Table 1. Publishers and articles based on search terms.

Table 2. Categories with inclusion and exclusion

Table 3. Comparison of IDEs used in Programming Education

Table 4. Categorical data category, technique, and occurrences


1 Introduction

Smart education is potentially a ground-breaking component of the current digital age. Smart learning has gained increased attention as a complete technology [1]. Unlike traditional learning, which depends on the availability of content and materials and makes little use of smart devices, smart learning does not rely on such availability.

1.1 Background

Smart learning environments have the characteristics of ubiquitous learning and take advantage of social tools and the wireless communication of mobile devices. An SLE also provides learners with hands-on experience even when resources are limited. Smart learning is a technology that supports and enhances digital resources. Its contextually aware and adaptive devices can be widely used to help different students according to their learning capabilities and learning processes. Around the world, there is increasing recognition of SLEs as a means of connecting learning communities [2].

In addition, SLE is a highly adaptive technology. The primary reason is that learning platforms respond to the individual learner, which also increases learners' productivity, and they can be scaled up [1]. However, this field still needs research to make it more productive for students to learn smartly. More specifically, deeper research is required to develop a better smart computing environment.

Learning Analytics can be defined as the measurement, collection, analysis, and reporting of data about learners in order to improve the learning environment. Furthermore, it supports predictions from the collected data by mapping current trends related to the context, thereby enhancing the learning process.

The primary motivation behind this work is to inform learners, tutors, and professors about the usefulness, objectives, analysis, and efficiency of educational resources. The goals are to survey recent developments and deficiencies in learning analytics and to provide a well-defined evaluation of the essential ideas and literature in this domain. The work also focuses on deriving detailed conclusions from research studies and building an additional knowledge base.

Student engagement is a perpetual issue around the globe, and students often feel uninspired and bored during their studies. SLE encourages robust connectivity between students and facilitators anytime and anywhere. Learning Analytics customizes the flow of information, provides personalization, and optimizes the vast volume of data to improve the productivity of students. An integrated development environment (IDE), in turn, performs a dual role: collecting data and enhancing the learning experience. These factors increase learners' willingness to learn.

1.2 Theoretical framework

This study aims to present a well-structured, measurable assessment of the most significant works published on Learning Analytics in Smart Learning Environments for programming education. It also aims to draw broader conclusions from quantitative, experimental research studies and to develop a more cumulative knowledge base. The review is based on a wide range of research papers and journal articles.

Smart learning environments (SLEs) have been researched as a way to enhance teaching and learning by providing personalised education, quick feedback, motivation, and learning support. However, studies have also reported crucial issues faced in Learning Analytics, for instance data ownership, data privacy, transferability, data protection, and scalability [3].

1.3 Research Objectives

The concept of a smart learning environment is to create a forum where innovation helps the learning process. The idea of an SLE is not to make the learning process more complicated or harder but to improve the learning experience in a sophisticated way. The main objectives of this study are:

• To characterize the different types of data that can be collected from IDEs within a smart learning environment.

• To investigate the best algorithms for extracting knowledge for decision-making related to learning outcomes in programming education.

This exploration is data-oriented and covers the different processes related to Smart Learning Environments for enhancing programming education. The research explores the entire analytical process using data from IDEs. Relevant sources such as articles, published research papers, and books are examined to study programming education.

1.4 Research Questions

The following questions will be answered in this study.

1. What kinds of data are collected in Smart Learning Environments to support programming education?

2. In what way can the data from the IDE and sensors of a Smart Learning Environment be modified to provide innovative methods for improving the student learning experience in programming education?

3. What are the methodologies in Learning Analytics that support knowledge extraction from data available in Smart Learning Environment?

1.5 Research Methodology

A smart learning environment (SLE) is relevant to programming education since it supports ubiquitous and personalized learning. Therefore, it is important to study Smart Learning Environments and Learning Analytics in programming education through a literature review.

The features of SLEs include adapting to learners’ preferred ways of learning, context awareness, ubiquity, and intelligent feedback mechanisms [4]. Such a review would benefit academics, students, and other stakeholders. To this end, 265 research papers were considered and downloaded from different portals, for instance Google Scholar, ACM, Springer, and IEEE.

Different variations of the keywords (Smart Learning Environment, Learning Analytics, Programming Education) were searched in Google Scholar. Furthermore, these terms were searched inside the articles, and 108 relevant articles were found. These articles, which contain the above keywords in the title, introduction, or main chapters, were taken for the literature review.

In addition, 58 relevant articles were downloaded from the ACM Digital Library using the search terms learning analytics, Learning Education, Smart Environment, and programming education. Fifty papers were identified from Springer after studying the entire articles. These articles were systematically arranged in a folder so that the supervisor could validate them. The relevance of the articles can be verified by checking for keywords such as Smart Learning Environment, Programming, and Learning Analytics in this database. These were entered in the search conditions as general keywords, so that only related articles were downloaded. In the next step, all articles were recorded in a Microsoft Excel spreadsheet with the columns Title, Publication, Year, Idea, and Database.

In [5], the authors discussed the pros and cons of the traditional literature review, which was effective in terms of qualitative interpretation for analysing research findings. Meta-analysis focuses on the positives of the research studies and finds relationships that might be left out by other methodologies. A meta-analysis review makes it possible to generalize and combine results from different studies more readily than the usual approach. We followed the same approach, selecting a topic and the databases for this study. The scope of this research work was used to narrow the search down to a few databases. The thesis draws on an extensive set of downloaded articles, tracked in an Excel sheet organized by subject to make it easier to study the articles and identify their content. Smart learning environments are empowering instructors to implement substantially more versatile learning modules than before.

1.6 Structure of the Thesis

The order of this thesis is:

1. Problem description, research objective, research questions, research process and methodologies, Motivation and Acknowledgement.

2. Literature review of Smart Learning Environment, Learning Analytics, Programming Education, usages, and methods.


3. Brief description of research methodology, research context, data collection methods, different database selections, and search terms.

4. Results for the research questions posed during the process.

5. Discussion of the findings and interpretations for the research questions and their significance for improving programming education through greater use of Smart Learning Environments.

6. Conclusion, including guidelines and suggestions that help the reader understand the work concisely and clearly.

1.7 Summary of the chapter

The introduction contains seven sections. Section 1.1 gives the background and context, dealing with Smart Learning Environments and Learning Analytics and leading to the literature review. Section 1.2 deals with the main motive behind the research. Section 1.3 presents the research objectives and the desired results of the literature review. Section 1.4 contains the research questions. Section 1.5 presents the research methods used to break the analysis into parts. Section 1.6 presents the structure of the complete thesis.


2 Literature review

This review covers elements and keywords related to Smart Learning Environments, Programming Education, IDEs, Learning Analytics, and sensors. Several databases were searched to find articles and journals relevant to the review. The literature review presents the different methodologies in Learning Analytics that are beneficial for extracting knowledge from the data available in SLEs.

The review studies the different types of data obtained from SLEs to enhance programming education. It combines various ways of achieving fruitful programming education and productive Smart Learning Environments. The review protocol is designed so that the research questions describe the critical areas of focus of the study.

2.1 Programming education

Programming education is a vast discipline in which students are taught to design and build computer programs to attain a specific computer output. It teaches methodologies for analysing problems and constructing algorithms, and ways of implementing those algorithms in a particular programming language. Programming education is a challenging and vital discipline. More and more students are becoming interested in coding and are pursuing it as formal education all around the world.

Programming was earlier a skill sought after by only a handful of IT professionals. However, the current rate of digitalization, in which software has become a mainstay of everyday life, has changed this perspective. It has become essential for all students to have computational thinking (CT) [35][37][42], i.e., problem-solving skills based on computer science whose solutions can be carried out by computers.

As a result, programming has been introduced in the school and college curriculum.

Computer programming education has a few problems and shortcomings that programming novices face during their studies. Programming education is quite dynamic, yet professors and teachers sometimes do not take students' learning styles and preferences into consideration, even though students may have different ways of grasping programming concepts.¹ Programming requires a good understanding of practical problem-solving techniques [36][48] and experimental and exhaustive study, unlike the usual theoretical courses [6].

1 https://en.wikipedia.org/wiki/Computer_programming

An IDE (Integrated Development Environment)¹ is a software application that assists computer programmers in software development. IDEs increase programmer productivity by combining everyday software activities into a single application: editing source code, building executables, and debugging. An IDE thus provides a single program that supports the whole development phase.

One objective of an IDE is to reduce the configuration needed to piece together multiple development tools. Instead, it provides the same set of capabilities as one integrated unit. Doing so reduces setup time and can improve developer productivity, particularly where learning to use the IDE is quicker than manually integrating and learning the individual tools. Tight integration of all development tasks can improve overall efficiency beyond helping with setup. For example, code can be continuously examined as it is written. IDEs are also the place where learning-process data can be collected. This data can be used to provide learning interventions that improve student learning.

2.2 IDE used in Programming Education²

BlackBox + BlueJ IDE: Blackbox was developed by the developers of BlueJ, an educational Java development environment, to formalise data collection in order to better understand how BlueJ is used and how students learn programming. Blackbox collects activity data from BlueJ users, including source code, edit sequences, testing and execution interactions, and compilation results.

1 https://www.codecademy.com/articles/what-is-an-ide

2 https://en.wikipedia.org/wiki/Integrated_development_environment


Figure 1. Blackbox Component Builder

Figure 2. BlueJ


ClockIt for BlueJ: ClockIt monitors and logs students' software development activities and allows them to understand which practices contribute to a successful software development process. It helps determine when a student started an assignment, how the work was distributed over time, and how long the assignment took to complete. This data is used to analyse students' understanding of computing concepts and whether they find them easy or difficult. It also helps identify students who are spending large amounts of time on an assignment.
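As a minimal sketch of the kind of measures described above (not ClockIt's actual implementation), the following Python function derives when work started, on how many distinct days it took place, and the elapsed time to completion from a list of logged event timestamps. The function name `summarise_assignment` and the input format are assumptions made for illustration.

```python
from datetime import datetime


def summarise_assignment(timestamps):
    """Summarise logged activity for one student on one assignment.

    `timestamps` is a hypothetical list of ISO 8601 strings, one per logged
    IDE event (for example a compile or an open-project event).
    """
    events = sorted(datetime.fromisoformat(t) for t in timestamps)
    if not events:
        raise ValueError("no events logged")
    started, finished = events[0], events[-1]
    return {
        "started": started,
        "finished": finished,
        # Number of distinct calendar days with activity: a rough proxy
        # for how the work was distributed over time.
        "days_worked": len({e.date() for e in events}),
        # Wall-clock time between the first and the last logged event.
        "elapsed_hours": (finished - started).total_seconds() / 3600,
    }


summary = summarise_assignment([
    "2021-03-01T10:00:00",
    "2021-03-01T11:30:00",
    "2021-03-03T09:00:00",
])
print(summary["days_worked"], summary["elapsed_hours"])
```

Elapsed time is only an upper bound on effort, since it includes breaks between work sessions.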

Figure 3. ClockIt BlueJ Data Visualiser summary


Figure 4. ClockIt Web Interface Project Size Graph

DevEventTracker for Eclipse: DevEventTracker works with the existing Web-CAT system to collect data from the Eclipse IDE while students are programming, giving in-depth insight into how students program. It is a system designed to collect fine-grained data about the student development process, and it compiles data about students writing and running their tests. In addition, it provides insight into students' autonomous software development habits. Software development and compilation steps are traced within the IDE and then sent to the Web-CAT server. Code snapshots corresponding to each event are also transferred to the server.

Figure 5. Structural Overview of DevEvent Tracker [7]

Web-CAT stands for Web-based Center for Automated Testing. It was designed at Virginia Tech to help students submit and test their code. It is an open-source project that allows universities to set up classes and assignments on the server and supports code submission. It promotes test-focused activity. Web-CAT assesses the code against instructor-written reference tests and the student's own unit tests. Multiple submissions are allowed and tracked, including the student's results for each submission. Instructors can review the submitted code and edit and confirm the scores. Web-CAT is a highly recommended platform for continuous data collection, as it is used by different universities worldwide, which also makes it a strong candidate for an extension that continuously collects and processes data.

Figure 6. Sample Project using Web-CAT¹

1 https://sourceforge.net/projects/web-cat/

Marmoset is an automatic scoring system, similar to Web-CAT, developed at the University of Maryland. Marmoset collects student code by means of an Eclipse plugin and stores it in a Concurrent Versions System (CVS) repository every time a file is saved. Although this feature is only practical for single-developer projects, it gives students an automatic backup and lets researchers run analyses on successive code versions, for instance counting the mistakes and issues that are introduced, fixed, or persist from one snapshot to the next. Snapshots are grouped into “effort periods” to estimate roughly how much time students spent on their assignment. Analysis is conducted separately from data collection, and no automated analysis is performed on individual snapshots or the changes between them.
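The grouping into “effort periods” can be illustrated with a short Python sketch. It is not Marmoset's code: the function names and the 20-minute gap threshold are assumptions chosen for illustration. Timestamps of saved code versions are split into a new period whenever the gap since the previous one exceeds the threshold, and the time spent is estimated as the sum of the period lengths.

```python
from datetime import datetime, timedelta


def group_into_effort_periods(snapshot_times, max_gap=timedelta(minutes=20)):
    """Group timestamps into periods of continuous work.

    A new period starts whenever the gap since the previous timestamp
    exceeds `max_gap`. The 20-minute default is an arbitrary, illustrative
    threshold.
    """
    periods = []
    for t in sorted(snapshot_times):
        if periods and t - periods[-1][-1] <= max_gap:
            periods[-1].append(t)
        else:
            periods.append([t])
    return periods


def estimated_time_on_task(periods):
    """Sum the span of each period; a single-timestamp period counts as zero."""
    return sum((p[-1] - p[0] for p in periods), timedelta())


snapshots = [
    datetime(2021, 3, 1, 10, 0),
    datetime(2021, 3, 1, 10, 5),
    datetime(2021, 3, 1, 10, 15),
    datetime(2021, 3, 1, 14, 0),
    datetime(2021, 3, 1, 14, 10),
]
periods = group_into_effort_periods(snapshots)
print(len(periods), estimated_time_on_task(periods))
```

The estimate is sensitive to the chosen threshold, which is one reason such figures are only rough.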

Figure 7. Experiences with Marmoset [8]

Hackystat is an open-source project that began at the University of Hawaii in 2001. It was developed to provide product and process measurements in a variety of software engineering settings, such as industry and education. Hackystat provides various sensors that are integrated into the user's environment and triggered by specific actions. In Hackystat, a sensor is code embedded in a development tool that detects specific changes and records data about the event when triggered. For example, there are sensors for detecting changes to the current file, capturing unit test output, and logging the outcomes of compilation and debugging. This information is sent automatically to the central server when a connection is available and is accumulated there for study. When the sensor is not connected to the server, the data is saved in a log for later transmission. Hackystat supports multi-user projects and provides users with a regularly updated web dashboard. Users can log in and check all the data collected from every project they are involved in.
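The sensor behaviour described above, recording events and holding them locally until the server can be reached, can be sketched in a few lines of Python. This is an illustrative toy, not Hackystat's API: the class `BufferedSensor`, the `send` callable, and the event fields are all hypothetical.

```python
import json
from collections import deque


class BufferedSensor:
    """Toy sensor that records events and forwards them to a server.

    `send` is a hypothetical callable that delivers one JSON string and
    raises ConnectionError when the server is unreachable.
    """

    def __init__(self, send):
        self._send = send
        self._pending = deque()  # events waiting to be delivered, oldest first

    def record(self, event_type, detail):
        """Queue an event and attempt delivery straight away."""
        self._pending.append({"type": event_type, "detail": detail})
        self.flush()

    def flush(self):
        """Deliver pending events in order; keep them queued on failure."""
        while self._pending:
            try:
                self._send(json.dumps(self._pending[0]))
            except ConnectionError:
                return False  # still offline; retry on the next flush
            self._pending.popleft()
        return True

    @property
    def pending_count(self):
        return len(self._pending)


# Simulated server that is offline at first and comes online later.
server = {"online": False}
received = []


def fake_send(payload):
    if not server["online"]:
        raise ConnectionError("server unreachable")
    received.append(json.loads(payload))


sensor = BufferedSensor(fake_send)
sensor.record("file_edit", "Main.java")
sensor.record("unit_test", "3 passed, 1 failed")
server["online"] = True
sensor.flush()
print(len(received), sensor.pending_count)
```

An event is removed from the queue only after it has been delivered, so nothing is lost if the connection drops part-way through a flush.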

Figure 8. Communication pathway in Hackystat [8]

2.3 Smart Learning Environment

A Smart Learning Environment gives an innovative experience that provides information by utilising accessible technical assets, thereby providing sophisticated learning experiences. It offers a high degree of learning assistance and has capabilities for personalization and customization, collaboration, knowledge gathering, and knowledge transfer [39-45]. These abilities are synchronized with the distributed learning environment and stakeholders' needs. Furthermore, an innovative learning environment creates simulations of real-world scenarios. The greatest benefit from Smart Learning Environments can be achieved by fully optimising the following elements.

1. Adaptation of the learning process: An action always occurs between the learner and the learning environment [9]. When this action occurs, the learning system uses the information it has about the learner to provide different assignments based on their progress, aptitude, and preferences, using a user model. Consequently, learners remain engrossed in learning activities that suit their context. There should always be scope for adaptation and custom changes to suit learners' interests.

2. Personalization: Each interaction with a learning system prompts an action [9]. Once the action is initiated, the system adapts and assigns tasks to learners based on their progress, performance, and preferences, using a user model. The idea is to keep learners engaged in learning tasks that fit their parameters. Customization should be considered an extension of automated adaptation, carried out on behalf of learners to satisfy their preferences and goals. Therefore, personalization in learning systems should always leave scope for altering and customizing learners’ processes, depending on the features that are essential for the learners [10]. Conventional learning systems tend to treat users as a homogeneous group, whereas personalised learning treats learners as heterogeneous individuals [11]. Learners’ requirements and attributes (prior knowledge, intelligence and grasping capacity, learning methods) are also equally crucial elements of personalization, according to [12] (page 5 of 21).

Customization is performed for learners, at least with their consent if not with their direct feedback, to put important learner-oriented approaches into practice. As a result, when a learning environment builds these capabilities, feedback from teachers and course creators as well as from students is extremely important. Personalized information for customization can thus be collected in three ways.

1. Through surveys that identify exactly what students need, as well as their goals.

2. By providing options through which students can choose their objectives and time frame.

3. By observing and recording learner behaviour and analysing it, so that useful data about learners' choices and usage patterns can be extracted [13].

As a result, for personalization purposes, individual data can be collected threefold:

(1) by conducting an early survey exploring students' hopes and intentions.

(2) by offering personalized settings for individual goal setup and the preferred timeframe.

(3) by following students’ activities and obtaining significant data about students' choices and the history of their navigational patterns.
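To make the idea of a user model concrete, the following Python sketch selects the next task for a learner from their recorded progress and stated preferences. It is a simplified illustration under stated assumptions rather than a method taken from the reviewed literature: the classes `LearnerModel` and `Task`, the 0-to-1 scales, and the selection rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LearnerModel:
    """Hypothetical user model holding what the system knows about a learner."""
    progress: float  # share of the course completed, from 0.0 to 1.0
    preferred_topics: set = field(default_factory=set)


@dataclass
class Task:
    title: str
    topic: str
    difficulty: float  # from 0.0 (easiest) to 1.0 (hardest)


def choose_next_task(learner, tasks):
    """Pick the task whose difficulty is closest to the learner's progress.

    Ties are broken in favour of topics the learner has said they like.
    """
    if not tasks:
        raise ValueError("no tasks available")
    return min(
        tasks,
        key=lambda t: (abs(t.difficulty - learner.progress),
                       t.topic not in learner.preferred_topics),
    )


learner = LearnerModel(progress=0.5, preferred_topics={"recursion"})
tasks = [
    Task("Loop practice", "loops", 0.25),
    Task("Recursive sum", "recursion", 0.75),
    Task("Sorting project", "algorithms", 1.0),
]
print(choose_next_task(learner, tasks).title)
```

In a real system the progress and preference values would come from the surveys, settings, and activity logs listed above.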


2.4 Learning Analytics

Learning Analytics can help observe students' activities and improvement over time by using data visualization that feeds data back to the learners [14]. A clear understanding of this feedback can help learners and teachers reach constructive, data-oriented conclusions. An accurate visualization is obtained from the analytics of different learning systems [15].

Hence, despite the technological hurdles, learning analytics is unobtrusive with respect to learners’ dealings with the system and can observe students' entire interaction with the computer. The following are vital elements when developing a learning analytics dashboard [16]: data awareness; activity-level data visualisations; self-assessment of activities; understandability; association with other learners; and goal-oriented visualisations.

2.5 Smart knowledge management

An e-learning environment has to collect and obtain data that is crucial and useful for drawing conclusions and resolving issues [17]. E-learning generates a vast collection of data that, through a structured information organisation process, can enhance the learning experience. Diverse methods can be used to accumulate and observe the varied data that matters for learning experiences, such as big data analytics, educational data mining, the semantic web, and knowledge maps [18].

Including these components in an SLE gives learners ways to obtain various skill sets and reach their learning outcomes [19], and gives tutors a tool for tracking, observing, and enhancing existing teaching methodologies. Figure 9, taken from [12], shows that control of smart content lies between every aspect of the smart environment.


Figure 9. Learning in smart environment

Students actively participate in changing their flow of activities and adapting it to their outcomes and requirements. By following these components in smart learning environments, learners can efficiently obtain many specialised abilities and competencies and reach the stage of autonomous learning [19]. Furthermore, this approach also provides professors with an instrument for tracking, analysing, and improving existing teaching practices. At the same time, teachers should not dominate the learning process completely and should plan carefully, with customization options.

2.6 Why Learning Analytics in SLE

Learning Analytics helps to combine different observations of user logs, teaching objectives, usage, and learner activities from various sources to enhance predictive models, recommendations, and reflections [16]. There are several examples of visual LA, such as dashboards, that help visualise different learning processes [20], for instance the time spent on assignments, the use of resources, and assignment and exam results [16], [21]. In [21], the knowledge and experience needed to create learning analytics were touched upon, along with what information to provide to students and how to visualise it.

A smart classroom learning environment is critical for accomplishing these goals and improving teaching and learning. It is equipped with modern technology and online devices, which are equally essential for making innovative education a beneficial and informative experience for both educators and students. The best smart educational content depends on students' learning capacities, their range of programming skills, and the objectives set for students and teachers. It can be of great help to students when they are given programming content based on their tests and evaluations, so that they can see the main areas that need improvement and more effort. This helps ensure that smart-education skills and abilities are being learned continuously. An ideal smart learning environment for improving programming education can be attained by using Learning Analytics.

Programming education needs constant attention and assessment. The analysis and security of those assessments can be a good indicator of students' grasp and understanding, and of teachers' and faculty members' success in bringing it about [22]. Universities and higher education institutions should follow this principle: “Universities should treat learning as not yet wholly solved problems and hence always in research mode.” [23]

Figure 10. Healey and Jenkins Model 2009

The Healey and Jenkins model presents four ways of engaging students and explains how to keep them occupied and engrossed. Students can be involved either as active participants or as observers.

The model also classifies approaches by their emphasis, either on research content or on research processes and problems. Both are interconnected and neither can be neglected. Blended learning can be an answer to this [24]. A smart learning environment holds an immense amount of information about programming education and its students, and there should always be a way to capture it rather than let it go to waste. Educational data mining and learning analytics are ways to do so. Learning Analytics can help to analyse, obtain, and present information about each student in programming education. It also provides continuous feedback during online courses. This allows faculty members to enhance programming courses for future students.

“Learning analytics is the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimising learning and the environments in which it occurs.” [25]

Learning Analytics effectively changes the perspective on programming education and the ways in which programming knowledge is learned. LA depends on, and is interconnected with, students' activities in the Smart Learning Environment, which produce a data trail that is highly beneficial to teachers and content creators. Learning Analytics measures students' efforts and studies their willingness and attitude towards programming concepts and studies. It tracks students' progress in the SLE in depth. It helps students self-assess and check how clearly they understand concepts by doing regular tasks based on the data and their learning abilities. LA stresses quality over quantity, as it focuses on resolving the ideas and queries in students' minds while they solve programming-related problems. The benefits of Learning Analytics in SLE for programming education can be summarised as follows:

1. Structured and advanced course selection: The main benefit of Learning Analytics is the selection of more suitable course material that meets students' learning needs in programming education and is aligned with their study programme. Analysing student participation and interests helps universities and faculty refine the course material and increase student enrolment in programming education. They can also set long-term goals for future programmes.

2. Content enhancement: Optimal use of LA helps improve the course material in programming.

2.7 Benefits of Learning Analytics

Learning analytics collects and measures data from learning experiences and builds a quantitative understanding of how this information relates to human choices and patterns of behaviour. It can change the effect of training on performance, change the perception of teaching, and change how practical training is carried out. Analytics enables an online educational platform to give learners detailed guidance and feedback on their performance. For instance, learning analytics can prompt learners to reflect on their behaviour in ways that help them manage their progress toward their learning objectives. Learning analytics can change the way effects and results are measured in learning and working environments, enabling learners to find better ways of achieving excellence in teaching and learning and giving them the information they need to make sound decisions about their education. Educational institutions should take this opportunity to work for the development and betterment of the world through proper use of learning analytics.


3 Methodology

3.1 Research context

Smart learning is advancing rapidly and is becoming increasingly favourable and collaborative for the learning objectives of both students and instructors.

Learning analytics in smart learning environments for programming education is a significant and serious topic for securing quality education. Although it is still at an early stage in education, this research can help educational institutions apply Learning Analytics in SLE for programming education.

3.2 Data Collection Procedure

This section describes the data collection procedure used for the literature review.

The procedure was used to make the literature review more straightforward and more precise, and specific steps were followed in carrying it out. These steps may also help other researchers conduct their own literature reviews.

3.3 Search Databases

The information sources chosen for this literature review were IEEE Xplore, ACM, Springer, and Google Scholar. Relevant articles were selected and downloaded from these databases.

Various keywords were used to retrieve significant articles on Learning Analytics in Smart Learning Environments for Programming Education. The search was comprehensive and time-consuming.

3.4 Search Terms

Many different keywords were used to find relevant articles on Learning Analytics in Smart Learning Environments for Programming Education. The extensive search used the following keyword terms and their combinations.


Table 1. Publishers and articles based on search terms.

Publisher        Search Terms
Google Scholar   “Smart Learning Environment” AND “Learning Analytics” AND “Programming Education”
ACM              “Smart Learning Environment” OR “Integrated Learning Environment” OR “Programming Education” OR “Learning Analytics”
Springer         “Smart Learning Environment” OR “Integrated Development Environment” OR “Programming Education”
IEEE             “Smart Learning Environment” OR “Integrated Learning Environment” OR “Programming Education” OR “Learning Analytics”

3.5 Selection of articles

A large number of articles applicable to our literature review were downloaded. All the relevant articles were analysed and extracted for the study. The sources were then examined by skimming the abstract and the keywords present in the title, introduction, and chapter 1.5 of each article. The articles were then arranged in a folder with sub-folders by database, so that the supervisors could approve the articles and provide a critical review.

Table 2. Categories with inclusion and exclusion

Category          Inclusion                                           Exclusion
Time frame        Articles published between 2005 and 2020            Articles published before the time frame
Important topics  Learning Analytics in Smart Learning Environment    Not related to Learning Analytics in Smart Environment
Subject area      Programming Education                               Not related to Programming Education
Language          English                                             Not in English
Access            Open access                                         Closed access

The relevance of an article can be verified by checking for keywords such as Smart Learning Environment, Learning Analytics, and Programming in the database. These can be entered in the search condition field as general keywords so that only related articles are downloaded.
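The criteria in Table 2 can be read as a simple filter over article records. The following Python sketch is only an illustration under stated assumptions: the `Article` record, its field names, the `REQUIRED_KEYWORDS` set, and the `is_included` helper are hypothetical and do not describe a tool used in this review.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical record for one candidate article."""
    title: str
    year: int
    language: str
    open_access: bool
    keywords: set = field(default_factory=set)


# Assumption: an article counts as on topic when all three terms
# appear among its keywords.
REQUIRED_KEYWORDS = {
    "learning analytics",
    "smart learning environment",
    "programming education",
}


def is_included(article: Article) -> bool:
    """Apply the inclusion criteria of Table 2 to one article."""
    keywords = {k.lower() for k in article.keywords}
    return (
        2005 <= article.year <= 2020               # time frame
        and article.language.lower() == "english"  # language
        and article.open_access                    # access
        and REQUIRED_KEYWORDS <= keywords          # topic and subject area
    )


candidate = Article(
    title="Example article",
    year=2016,
    language="English",
    open_access=True,
    keywords={"Learning Analytics", "Smart Learning Environment",
              "Programming Education"},
)
print(is_included(candidate))
```

Keeping the criteria in one function makes it easy to see which rule excludes a given article when the criteria are adjusted.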

3.6 Data Analysis

In this study, all the selected articles were recorded in a Microsoft Excel spreadsheet with the columns Title, Publication, Year, Idea, and Database. The articles were arranged so that the relevant content could be analysed and identified. Distinct combinations of keywords such as Smart Learning Environment, Learning Analytics, and Programming instruction were used to conduct the search.

For example, one of the most relevant analyses for the literature review (A Meta-Analysis of Empirical Research Results from 2009 to 2015) was found on Google Scholar with the keyword string Learning Analytics and Smart Learning Environment, and it was downloaded after its abstract had been read. It was then recorded in the Excel sheet with its title, year of publication, the main idea presented in it, and the database from which it was downloaded, Google Scholar. This made it simpler to evaluate a diverse range of articles in the same manner. Whenever a critical article needed to be revisited, I consulted the Excel document to find the article name and the information relevant to evaluating it for use in the thesis. Filters could be applied in the Excel file to sift through the articles.


Figure 11. Summary of the systematic review execution process

4 Results

Various articles were analysed and reviewed to find an appropriate answer to the research questions.

4.1 Types of data collected in Smart Learning Environment for Programming Education

This section describes the kinds of data collected in programming education that are suitable for learning analytics techniques.

Different kinds of data help form extensive datasets that further aid in modelling the data to answer the research questions under consideration. The types of data collected range from user actions to users' activity and engagement with the learning platform, and they differ in the scope of information they provide. Information gathered from Smart Learning Content can be separated generally in the following ways:


Compose data: Compose data is a richer level of data that gives a big picture of the available data. It describes what happens while content is being created: it is about actions, and it covers everything that stakeholders do while using the software. This type of data is collected for automation and decision making. The main advantages of using compose data are:

1. Higher fidelity: As stated above, this data gives a big picture of the data in use, based on the actions of all stakeholders, which makes it more reliable for decision making. Also, since the owner owns the compose data, they can use it with more confidence.

2. More flexibility: The data is captured raw, giving users the power to perform any type of analytics.

The extracted data covers activities such as clipboard operations, mouse clicks, keystrokes, and use of graphical widgets. It also stores click history, such as students' navigation history in online programming courses (login/logout), session duration, time spent on a specific programming discussion, and number of tutorial videos watched. It additionally records the date and time of each login.

Compile-time data: This stores activities such as coding errors, compilation attempts, problems started, cases attempted, answers completed and submitted, feedback, specific tests, and course task scores. It also records which programming course was chosen most often, collects action logs, and can capture the timestamps of programming tasks.

Run-time data: This type of data covers run times, logs, and exceptions generated by the student's activity. Run-time data is collected from Smart Learning IDEs. In addition, it contains basic data on the use of the Smart Learning Environment, e.g., time spent in it and average performance in assessments (tests and number of resolved issues and problems).

Additional data: Additional data about students includes name, area, education level, email ID, and other relevant demographics that are available. It also contains data about enrolment status, the programming languages chosen for study, and the programming languages already studied.


Social data: Social data extracted for programming education can include messages, remarks, and comments on a given piece of code. It can also be used to recommend future courses and even career paths based on previous knowledge. It also covers forums and discussions, including replies by students and tutors regarding syntax or code reviews. Tutors can also collect data on session duration, which helps measure the level of a student's engagement on a particular page.

Testing data: Testing data is about test cases, test results, and test execution.

Augmented data: Augmented data can be instrumental in improving learning analytics, mining information, and comprehending the patterns of the students and video content.

Biological data: Health data is collected from Smart Learning Content. It can be monitored by instructors and students through measures such as number of steps, running distance, and sleep time.

All types of data can help engineers and developers build the Smart Learning Environment. They can also be used to produce valuable portfolios of students' progress and assessments of their performance, which would be of great use to both students and their instructors.
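The categories described above can be sketched as a small lookup that assigns raw event types to a data category. This is a minimal illustration in Python: the category names follow this section, while the event names (such as `keystroke` or `coding_error`) and the helper functions are hypothetical labels chosen for the example, not identifiers used by any particular IDE.

```python
from enum import Enum
from typing import Optional


class DataCategory(Enum):
    COMPOSE = "compose"
    COMPILE_TIME = "compile-time"
    RUN_TIME = "run-time"
    TESTING = "testing"
    SOCIAL = "social"
    ADDITIONAL = "additional"
    AUGMENTED = "augmented"
    BIOLOGICAL = "biological"


# Hypothetical event names mapped to the categories of this section.
EVENT_TO_CATEGORY = {
    "keystroke": DataCategory.COMPOSE,
    "mouse_click": DataCategory.COMPOSE,
    "clipboard_operation": DataCategory.COMPOSE,
    "coding_error": DataCategory.COMPILE_TIME,
    "compilation_attempt": DataCategory.COMPILE_TIME,
    "run_time_error": DataCategory.RUN_TIME,
    "exception": DataCategory.RUN_TIME,
    "test_result": DataCategory.TESTING,
    "message": DataCategory.SOCIAL,
    "code_comment": DataCategory.ADDITIONAL,
    "location": DataCategory.AUGMENTED,
    "heart_rate": DataCategory.BIOLOGICAL,
}


def categorize(event_type: str) -> Optional[DataCategory]:
    """Return the data category of an event type, or None if unknown."""
    return EVENT_TO_CATEGORY.get(event_type)


def count_by_category(event_types):
    """Count how many events fall into each known category."""
    counts = {}
    for event_type in event_types:
        category = categorize(event_type)
        if category is not None:
            counts[category] = counts.get(category, 0) + 1
    return counts


sample = ["keystroke", "keystroke", "coding_error", "unknown_event"]
print(count_by_category(sample))
```

Unknown event types are skipped rather than forced into a category, which keeps the counts honest when an IDE emits events outside this taxonomy.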

Furthermore, we have extracted data and prepared a table for a better understanding of the exposition. The table includes the data category, technique, platform type, etc.

For the exploration of social data, techniques such as educational data mining, qualitative and quantitative feedback, collaboration metrics, students' learning behaviour patterns, visual environment data, and assessment metrics have been used to analyse categories such as novel taxonomy, Minecraft, e-textbook systems, game-based learning environments, and technology-enhanced learning.


Table 3. Comparison of IDEs used in Programming Education

IDE columns: Eclipse, Hackystat, NetBeans, Visual Studio, Marmoset, IntelliJ IDEA, Blackbox, Terms, Web-CAT, BlueJ

Data category      Type of data
Compose data       Text; Mouse click; Keystroke; Images; Clipboard operations
Compile-time data  Coding error; Compilation attempts
Social data        Messages; Remarks/Comments
Run-time data      Run-time error; Logs; Exceptions
Testing data       Test results; Tests execution
Additional data    Documents; Code comments; Inspect data; Debug data; Sensor's data
Augmented data     Location data (coordinates)
Biological data    Eye tracking; Heart rate; Mouse pressure; Keyboard pressure

Data collection services have increasingly been integrated into Eclipse, BlueJ [26], and Hackystat [27]. Eclipse can be used to collect testing data on students' testing exercises and execution. In addition, Eclipse supports automated testing tools such as Web-CAT [28].

Several tools function with the Eclipse IDE: the DevEventTracker plugin with Web-CAT [29], the Marmoset plugin [8], and the open-source Hackystat [27]. Through a web API, Eclipse collects data for an analytical framework. Hackystat and Marmoset can also plug into Visual Studio [30] and NetBeans [31].

IntelliJ handles smart code completion, which is an integral part of the application. IntelliJ also collects MP3 files and Java database files through the GUI toolbar [32]. Blackbox observes and records the programming behaviour of students or new learners and collects programming activity data [33].

Blackbox collects data from users of the BlueJ IDE worldwide and builds on established computing education research techniques [46]-[49]. The data is stored in a MySQL database on a single machine. Blackbox allows researchers to generate unique ID numbers that link text data and demographic data about participants [34].

4.2 The data from IDE and sensors which can be used to find innovative methods for improving student learning abilities in programming education

IDE methods for collecting information have specific limitations. For instance, when a student edits code while working on an assignment, the intermediate edits can be lost. A student's attempts to execute the code are as important as the dry runs the student does by writing pseudocode. Hence, the limitations of the IDE must be kept in mind while designing Learning Analytics.

By extracting helpful information from the IDE, we can gain valuable insights into students' learning and progress, which helps in designing better programming modules.

Here we present a few methods for extracting information from the available raw data.

The table below shows the taxonomy of helpful information from the IDE for improving students' learning, in which platform and technique are central.

Total: Information can be extracted by counting the total number of raw data points.

The selection of information is based on specific parameters, e.g., the time limit of the task, the data points, and the person doing the task.

Analytical: A numerical formula is a sound way of detecting and extracting vital insights from the data, e.g., the average or median time taken to complete a task, or the percentage of data points.


Algorithm: An algorithm is a step-by-step procedure for a given problem and is used to transform data into valuable insights. The algorithm breaks the analysis down into smaller pieces, which helps match the output with the prediction.

Visualization: Data visualization offers insights that are impossible to obtain by simply examining the unprocessed data points. Visualization helps in interpreting the data and making decisions.

Machine Learning: Machine learning identifies trends and patterns in the data, learns from them, and is helpful for making forecasts.
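The Total and Analytical types of analysis described above can be sketched in a few lines of Python. The event records, their `type` values, and the task durations below are invented for illustration only; they are not data from the reviewed studies.

```python
from statistics import mean, median


def total(events, event_type):
    """Total: count the raw data points of one event type."""
    return sum(1 for event in events if event["type"] == event_type)


def percentage(events, event_type):
    """Analytical: share of data points of one event type, in percent."""
    if not events:
        return 0.0
    return 100.0 * total(events, event_type) / len(events)


def summarise_durations(minutes):
    """Analytical: average and median time taken to complete a task."""
    return {"mean": mean(minutes), "median": median(minutes)}


# Invented sample data for illustration.
events = [
    {"student": "s1", "type": "compilation_attempt"},
    {"student": "s1", "type": "coding_error"},
    {"student": "s2", "type": "compilation_attempt"},
    {"student": "s2", "type": "test_run"},
]
minutes_on_task = [10, 20, 60]

print(total(events, "compilation_attempt"))
print(percentage(events, "compilation_attempt"))
print(summarise_durations(minutes_on_task))
```

With skewed task times, the median is less affected by a single slow student than the mean, which is one reason to report both.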

Basic student learning and development can be represented by simple data types, as seen in Table 4. The content is the foundation for improving students' programming skills. To enhance learner progress, constant encouragement and motivation should also keep students working hard. Another vital point to keep in mind is how the intervention is presented to the programming student. In a smart learning environment, the IDE can present information in different forms, such as graphs and charts, which show progress and feedback graphically.

The critical part is to ensure the effectiveness of programming education and the level of student engagement with it. For instance, visualizations should be designed so that the data presented covers aspects of student engagement such as immediate tasks.


Table 4. Categorical data category, technique, and occurrences

Data category                Technique                                                   Type of analysis
Programming management       Time spent on task                                          Total
                             Time between starting the assignment and the deadline      Total
                             Code changes                                                Total
                             Number of executions                                        Analytical
                             Number of written test cases                                Total
                             Code coverage                                               Algorithm
Awareness                    Number of inquiries addressed correctly and incorrectly    Analytical
                             Student's log                                               Visualization
                             Students' learning behaviour patterns                       Visualization
Social data                  Count of created posts                                      Total
                             Count of asked questions                                    Total
                             Count of answered questions                                 Total
                             Count of unanswered questions                               Total
                             Count of earned marks                                       Total
                             Count of remarks                                            Total
                             Count of messages (self-reported emotions)                  Total
Code efficiency              Program execution time                                      Total
                             Result of test cases                                        Total
                             Source code plagiarism                                      Algorithm
                             Coding with comment style                                   Algorithm
                             Code size in the program, including functions               Total
Physiological response info  Human mood detection                                        Algorithm
                             Pleasure, displeasure, and anxious feelings                 Algorithm
                             Eye movement                                                Visualization

The taxonomy contains a few data categories, each corresponding to a different type of analysis.

Alerts interrupt the student's learning with a pop-up message, e.g., a pop-up message in the IDE that guides the student through steps. Firstly, an alert is a text message, as opposed to a visual representation. Secondly, it interrupts the learner within the IDE. Constraints help prevent the student from performing any illegal activity. Denials in the IDE work as controls for different programming tasks. Reminders prompt programmers to execute their code: when a student has not compiled a program for a long time, the IDE pops up a message about compilation. Time reminders in the IDE encourage and guide the student towards a different plan of action. The final vital method is an intervention designed for the user and delivered through the IDE. When created appropriately, it supports the user's learning curve.
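The compilation reminder described above can be sketched as a simple rule. This Python example is a hedged illustration: the function names, the ten-minute default threshold, and the message text are assumptions made for the example, and no reviewed IDE is claimed to implement the rule this way.

```python
from typing import Optional

# Assumption: ten minutes without compiling counts as "a long time".
DEFAULT_MAX_IDLE_SECONDS = 600


def needs_compile_reminder(seconds_since_compile: float,
                           edits_since_compile: int,
                           max_idle_seconds: float = DEFAULT_MAX_IDLE_SECONDS) -> bool:
    """Remind only if the student has edited code but not compiled for a while."""
    return edits_since_compile > 0 and seconds_since_compile >= max_idle_seconds


def reminder_message(seconds_since_compile: float,
                     edits_since_compile: int) -> Optional[str]:
    """Return the pop-up text, or None when no reminder is due."""
    if not needs_compile_reminder(seconds_since_compile, edits_since_compile):
        return None
    minutes = int(seconds_since_compile // 60)
    return f"You have not compiled for {minutes} minutes. Try compiling your code."


print(reminder_message(900, 12))
print(reminder_message(900, 0))
```

Requiring at least one edit since the last compilation avoids interrupting a student who is only reading the code.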


4.3 The methodologies in Learning Analytics that support knowledge extraction from data available in Smart Learning Environment

Section 4.2 discussed the most valuable data from IDEs and sensors for improving student learning abilities in programming education. Predictions also support knowledge extraction, since they estimate the likely outcome of a question from historical data and thereby improve the accuracy of learning. The systematic literature review identified different techniques and methodologies that are beneficial for inferring knowledge from programming learning data in smart learning environments.

The first methodology is to update learners on their progress through notifications, which encourage learners to improve and present concrete suggestions so that they do not give up. Another promising methodology is Learning Analytics (LA) in the smart learning environment.

However, stakeholders in higher education need to become better acquainted with issues related to the use of Learning Analytics in higher education. LA is multidisciplinary, drawing on data processing, technology-enhanced learning, educational data mining, and visualization [35].


Figure 12. Taxonomy of design methodology in Learning Analytics

Timing helps predict student behaviour and analyse events. The schedule identifies and predicts patterns in student behaviour and performance. Predictive analytics (PA) gathers historical data and helps educators predict educational outcomes. The past decade has seen a growing number of publications examining educational data and giving significant insights into education and the learning environment. Learning analytics (LA) is gaining attention as an emerging research area.

[Figure 12 labels: Design Methodology; Notification (Inspiration, Data, Push, Suggestion); Visualization (Necessity, Demonstration, Submission); Schedule (Engagement, Event Timing, State-Based, Persevering)]

When students are at risk of poor performance, learning analytics provides educators with information on the students' overall development. As a result, the educator can implement different strategies that help the students adapt to their current environment.

Machine learning estimates the quality of information about students and their contexts in order to improve education. For instance, ML summarises learning data in a dashboard so that students can reflect on their activities, engagement, and progress, and instructors can reflect on their teaching practice and decide on necessary interventions. Data Science is characterised as learning in interactive, intelligent, and personalised environments, supported by advanced digital technologies and services (e.g., context awareness, augmented reality, social engineering, and networking).

As a data-driven methodology for better organising and advancing learning and the learning environment, learning analytics can undoubtedly contribute to innovative education. Human-computer interaction relates to information on assessment, analysis, the improvement of learning environments, and ethical issues.

Performance reports inform educators, developers, and executives about the strategies, challenges, and benefits of Learning Analytics.

5 Discussion

This literature review addresses three research questions; the following sections respond to them.

5.1 RQ 1. What kind of data should be collected in the Smart Learning Environment for assisting programming education?

We can now discuss the data types collected in the Smart Learning Environment to improve programming education. Based on the results in section 4.1, data can be classified broadly into the following categories: compose data, compile-time data, social data, run-time data, biological data, testing data, augmented data, and additional data.

These data categories are collected throughout the entire programming process.


Compile-time data consists of coding errors and compilation attempts that occur while students perform the programming tasks assigned to them. Social data consists of messages, remarks, comments, and feedback provided for the code. It also includes notifications about upcoming courses, career options in programming, forum debate information, and student engagement sessions. Run-time data is shown in Table 3: it contains run-time errors, logs, and exceptions, which can be tracked while the student is programming. Testing data consists of test results and test execution.

Additional data covers documents, code comments, inspect data, debug data, and other demographic information. Augmented data covers student location data. Biological data covers eye tracking, heart rate, mouse pressure, and keyboard pressure. These categories were extracted by reviewing other literature on data collected in SLE for programming education, with particular emphasis on articles related to programming education. Most of the relevant articles mentioned student demographics and student programming data, which includes editing data, compilation-level data, execution-level data, and debugging data. It also includes student event data such as keystrokes, logins, open and save actions, and code screenshots.

Other data mentioned includes time spent in the SLE, password changes, programming language preference, and social interaction data such as surveys, quizzes, questions asked in surveys and their responses, forum post likes, audience, forum post replies, messages, and the number and members of different assignment groups. Data can also be accumulated about the professor's feedback, assignment marks, the professor's interaction with the student, overall grades, etc.

5.2 RQ 2. In which way can the data from IDE and sensors be used for extracting information and for finding innovative methods for improving student learning abilities in programming education?

After going through the literature suitable for answering this research question, we understood the methodologies for effective data collection and the different models that can be used for it. The literature discussed different data types, such as standard data and augmented data. Different IDEs are used for data collection, and each IDE varies in the style and methodology by which data is collected. Some IDEs collect editing data; some gather data at each keystroke, others only when the code is compiled. As a result, it becomes extremely tedious to perform studies across IDEs. A standardised data format for IDEs is highly recommended so that the learning initiatives undertaken in different IDEs can be studied. A universal repository or directory could be maintained for IDE log data. It could be public and easily accessible to other universities and researchers, which would support IDE research aimed at innovative methods for improving student learning in programming education. A standard API could be built to strengthen IDE architecture, which would help data collection and innovation in programming education. This would require collaboration among different programmers. Eventually, a broader spectrum of programming methods and IDE impacts could be studied, which would help advance programming education.
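To make the idea of a standardised data format for IDE log data more concrete, the following Python sketch shows one possible shape of a shared event record serialised as JSON. The `IdeEvent` class and every field name in it are hypothetical; no existing standard or IDE API is implied.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class IdeEvent:
    """Hypothetical IDE-independent log record."""
    student_id: str   # pseudonymous identifier, not a real name
    ide: str          # e.g. "BlueJ" or "Eclipse"
    category: str     # e.g. "compile-time"
    event_type: str   # e.g. "compilation_attempt"
    timestamp: str    # ISO 8601 text, e.g. "2021-03-01T10:15:00Z"
    payload: dict     # event-specific details


def to_json(event: IdeEvent) -> str:
    """Serialise an event so that any IDE or repository can exchange it."""
    return json.dumps(asdict(event), sort_keys=True)


def from_json(text: str) -> IdeEvent:
    """Rebuild an event from its JSON form."""
    return IdeEvent(**json.loads(text))


event = IdeEvent(
    student_id="anon-001",
    ide="BlueJ",
    category="compile-time",
    event_type="compilation_attempt",
    timestamp="2021-03-01T10:15:00Z",
    payload={"success": False, "error_count": 2},
)
restored = from_json(to_json(event))
print(restored == event)
```

A flat record with a free-form `payload` keeps the common fields comparable across IDEs while leaving room for IDE-specific details.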

Along with programming process data, augmented data such as quizzes and survey data should also be collected. Analysing the interconnections between different IDE data types would be beneficial for designing better predictive models of performance and more innovative programming modules. Addressing this broader spectrum would not only help programming students with their learning skills but would also be responsive to their requirements.

As the table in section 4.2 shows, there are different ways in which data from an IDE and sensors can be analysed for knowledge discovery and decision making to improve student learning in programming education.

For programming management, Total can be used for the productive time spent on the task, code changes (any changes made to the actual programming code), and test cases.

Analytical can be used for counting completed executions, and Algorithm can be used for code coverage. For awareness, Analytical can be used for the number of inquiries, and Visualization can be used for student logs and learning behaviour patterns. For social data, Total can count the number of posts, asked, answered, and unanswered questions, earned marks, and feedback. For code efficiency, Total can be used for program execution time, the result of test cases, and code size, while Algorithm can be used for checking plagiarism and comment style. For physiological response info, Algorithm can be used for human mood detection and feelings detection, and Visualization to track eye movement.

5.3 RQ3. What are the methodologies in Learning Analytics to extract knowledge from data available in smart learning environment?

As discussed in section 4.3, there are three learning analytics techniques for inferring knowledge from programming learning data in smart learning environments: Notification, Visualisation, and Schedule.

Notification determines what must be presented. It can inspire learners by encouraging them not to give up and to improve further, and it apprises them of their progress and growth by offering them data. It can act as a source of inspiration, as a push or boost to keep improving and working hard, or as a way to provide suggestions or proposals. Visualisation, as the name suggests, is about ways of presenting the information to the learner. An IDE-based learning environment has three different techniques for extracting knowledge and presenting it. Demonstration could be helpful for effective visual analytics in a learning environment. A significant consideration is that demonstrations are efficient in encouraging learning only if learner engagement is high, e.g., there should be some sort of interaction in the demonstration [35].

Human-system interaction and personal feedback can also enhance the efficacy of demonstrations. A report is quite different from a demonstration: first, it delivers text messages, and secondly, it provides a reminder in the form of a pop-up message in the IDE. Schedule determines when the information must be presented to the learner. One option is Persevering, in which the information is continuously available. The State-Based technique helps when the user is known to move into, or remain within, an educationally unproductive state. On-Demand presents information only after the user requests it. For example, an IDE could offer a “Get Hint” button that, when clicked, produces a recommendation or analysis for the learner. This type of schedule differs to some extent from the Persevering option: although it is always available, it is not visible to the learner unless the learner invokes it.
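The three schedule options discussed above (Persevering, State-Based, On-Demand) can be sketched as a small decision function. This is an illustrative Python sketch only: the enum, the function name, and the two boolean inputs are assumptions introduced for the example.

```python
from enum import Enum


class Schedule(Enum):
    PERSEVERING = "persevering"   # information is continuously visible
    STATE_BASED = "state-based"   # shown when the learner is in an unproductive state
    ON_DEMAND = "on-demand"       # shown only after the learner asks, e.g. "Get Hint"


def should_present(schedule: Schedule,
                   in_unproductive_state: bool = False,
                   user_requested: bool = False) -> bool:
    """Decide whether the information should be shown to the learner now."""
    if schedule is Schedule.PERSEVERING:
        return True
    if schedule is Schedule.STATE_BASED:
        return in_unproductive_state
    if schedule is Schedule.ON_DEMAND:
        return user_requested
    raise ValueError(f"unknown schedule: {schedule}")


print(should_present(Schedule.PERSEVERING))
print(should_present(Schedule.STATE_BASED, in_unproductive_state=True))
print(should_present(Schedule.ON_DEMAND, user_requested=False))
```

The sketch reflects the difference noted in the text: an On-Demand hint is always available but stays hidden until the learner requests it, whereas a Persevering display is always visible.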
