Bodong Chen

Crisscross Landscapes

Notes: Kinnebrew2013-A Contextualized, Differential Sequence Mining Method to Derive Students’ Learning Behavior Patterns



Citekey: @Kinnebrew2013-de

Kinnebrew, J. S., Loretz, K. M., & Biswas, G. (2013). A Contextualized, Differential Sequence Mining Method to Derive Students’ Learning Behavior Patterns. JEDM - Journal of Educational Data Mining, 5(1), 190–219.



The core algorithm employs a novel combination of sequence mining techniques to identify di erentially frequent patterns between groups of students (e.g., experimental versus control conditions or high versus low performers). We extend this technique by contextualizing the sequence mining with information about the stu- dent’s performance over the course of the learning interactions. Speci cally, we employ a piecewise linear segmentation algorithm in concert with the di erential sequence mining technique to iden- tify and compare segments of students’ productive and unproductive learning behaviors (p. 1)

Cognitive scientists have established that metacognition and self-regulation are important components for developing e ective learning in the classroom and be- yond [Bransford et al. 2000; Zimmerman 2001]. In developing a computer-based learning environment (CBLE) called Betty’s Brain, we adopt the framework for self-regulated learning (SRL), which originated from the social constructive theory of learning proposed by Bandura [1997]. (p. 1)

We believe choice-rich, open- ended learning environments like Betty’s Brain, where students have to work on complex learning tasks can provide these much-needed opportunities to learn and practice these skills in classroom environments. (p. 2)

In previous work, we have used hidden Markov models (HMMs) [Rabiner 1989] as a direct (probabilistic) representation of these internal states and strategies. This methodology has facilitated identi cation, interpretation, and comparison of student learning behaviors at an aggregate level [Biswas et al. 2010; Kinnebrew et al. 2013]. Like a student’s mental processes, the states of an HMM are hidden, meaning that they cannot be directly observed, but they produce observable output (e.g., actions in a learning environment). However, the aggregated descriptions often mask the exact manifestation of speci c learning behaviors and strategies that students employ as they work on the system. To identify and analyze these detailed learning behaviors, we have developed sequence mining methods that provide a ner-grained analysis and more precise de nitions of learning behavior di erences between groups of students. (p. 2)

sequential pattern mining [Agrawal and Srikant 1995 (p. 2)

However, this approach can also result in a very large number of frequent patterns, especially when allowing gaps to ac- count for \noise.” For example, when we looked at activity traces of 22 8th grade students working on a climate change science unit in Betty’s Brain, we found over 1,000 patterns that occurred in at least 80% of their interaction traces on the sys- tem, even when we limited gaps to a single action between consecutive actions in the pattern. (p. 3)

In general, a major challenge in the analysis of frequent patterns is limiting the often large set of results to interesting or important patterns (i.e., the effectiveness of a mining technique in identifying useful patterns as opposed to its efficiency, or speed, in finding the patterns [Agrawal et al. 1993]). (p. 3)

To overcome this problem, we have developed an algorithm that employs a novel combination of sequence mining techniques to identify di erentially frequent pat- terns between groups of students (e.g., experimental versus control conditions or high versus low performers) [Kinnebrew and Biswas 2011]. (p. 3)


2.1 Identifying and Measuring Metacognition and Self-Regulated Learning (p. 4)

The traditional approach to measuring students’ metacognition and self-regulated learning (SRL) has been through the use of self-report questionnaires (e.g., [Pin- trich et al. 1993; Weinstein et al. 1987; Zimmerman and Martinez-Pons 1986]). (p. 4)

Increasingly, researchers have begun to utilize trace methodologies in order to examine the complex temporal patterns of self-regulated learning [Aleven et al. 2006; Azevedo and Witherspoon 2009; 2009; Biswas et al. 2010; Hadwin et al. 2007; Jeong and Biswas 2008; Zimmerman 2008]. The primary notion is that metacog- nition and SRL are evolving events that can be better understood by observing students’ learning and problem-solving activities. (p. 5)

For example, Hadwin et al. [2007] performed a study that collected activity traces of 8 students using the gStudy system [Perry and Winne 2006]. The activity traces were analyzed in four di erent ways: (i) frequency of studying events, (ii) patterns of studying activity, (iii) timing and sequencing of events, and (iv) content analyses of students’ notes and summaries. (p. 5)

Perry & Winne, 2006 (p. 5)

More recently, trace data is being supplemented with other sources of data, such as concurrent verbal think-alouds (e.g., [Azevedo and Witherspoon 2009] and mea- sures of a ect (e.g., automatic recording of facial expression, posture, etc. [Burleson et al. 2004; D’Mello et al. 2007; D’Mello et al. 2008; Lester et al. 1997]. (p. 5)

2.2 Applying Sequence Mining to Study Student Learning Behaviors (p. 6)

Perera et al. [2009] investigate trace data from mirroring and feedback tools that support e ective teamwork among students collaborating on software development using an open source professional development environment called TRAC. (p. 6)

Martinez et al. [2011] discover which frequent sequences of actions char- acterize high-achieving and low-achieving learners. (p. 6)

Like Perera et al. [2009] and Martinez et al. [2011], we compare sequential pat- terns derived from groups of student activity sequences. However, our di eren- tial sequence mining algorithm directly incorporates comparisons between groups in identifying interesting patterns, rather than manually performing researcher- directed comparisons after data mining. Further, in addition to sequential pattern mining metrics and constraints, we employ additional metrics, such as another frequency calculation based on episode mining, to identify di erentially frequent patterns. (p. 6)

Other researchers have employed sequential pattern mining (with a single set of student activity sequences or subsequences) to generate student models for cus- tomizing learning to individual students. For example, Tang and McCalla [2002] propose sequential pattern mining of students’ paths in a web-based environment, followed by a clustering step, to build better student models. (p. 6)

Some researchers have also employed sequence mining with the goal of identify- ing learning behaviors relevant to self-regulated learning. In particular, Nesbit et al. [2007] employ sequential pattern mining to investigate self-regulation in gStudy (p. 7)

The authors de ne a set of actions from events in log les and use sequential pattern mining to nd the longest common subsequences across a set of action les. Using this method, the authors hope to step beyond the question of whether a tool helps learners construct knowledge and instead in- vestigate when and how learners use the tool as they self-regulate their knowledge construction activities. (p. 7)


3.1 Action Abstraction with Context Summarization (p. 7)

Action abstraction is the rst step of our data mining methodology, in which researcher-identi ed categories of actions de ne an initial alphabet (set of action symbols) for the sequences. This step lters out irrelevant information (e.g., cur- sor position) and combines qualitatively similar actions (e.g., querying an agent through di erent interfaces or about di erent concepts in a given topic). (p. 7)

A form of feature engineering? (p. 7)

First, log events are mapped to a sequence of canonical actions (p. 7)

o maintain a balance between minimizing the number of distinct actions in the sequences and keeping potentially important context information, we determine whether the content/object of an action is the same as the content/object of any other action in a small, con gurable window of previous actions. Based on this relevance summarization we split each categorized action into two distinct actions: (1) relevant (with the -REL” suffix) and (2) irrelevant (with the -IRR” suffix) to recent actions [Biswas et al. 2010]. (p. 8)

To improve this exploratory analysis, our action abstraction step distin- guishes a single action from repeated actions (with a given threshold on the number of repeats), which are condensed to a single \action” with the -MULT” identi er. Using the re-transformed sequences, our di erential sequence mining technique can more efficiently identify trends that could otherwise be hidden by the multitude of frequent patterns di ering only by the length of a repeated action sequence [Kin- nebrew and Biswas 2011]. (p. 8)

3.2 Di erential Sequence Mining (p. 8)

our methodology employs a novel combination of sequence mining techniques. In particular, sequential pattern mining [Agrawal and Srikant 1995] is used to determine the most frequent action patterns across a set of action se- quences, while episode mining [Mannila et al. 1997] metrics are used to nd the most frequently used action patterns within a given sequence. (p. 8)

The sequential pattern mining frequency measure, which is important for identifying patterns common to a group of action sequences, is de ned as: the number of sequences in which the pattern occurs (regardless of how frequently it occurs). We refer to this frequency measure as the sequence support (or s-support) of the pattern, following the convention of Lo et al. [2008], and we call patterns meeting a given s-support threshold, s-frequent. (p. 9)

Another important metric for comparing patterns across groups is the episode frequency, which is de ned (for a single sequence) as: the number of times the pattern occurs, without overlap, in a sequence. For a given sequence, we refer to this frequency measure as the instance support (or i-support), following Lo et al. [2008]. We de ne the i-support of a pattern for a group of sequences as: the mean of the pattern’s i-support values across all sequences in the group. (p. 9)

Algorithm 1 de nes our di erential sequence mining technique, which combines the group s-support and i-support frequency measures to identify di erentially fre- quent patterns across two groups of action sequences (p. 9)

Further, we employ a sequential pattern mining algorithm that allows regular ex- pression constraints (the regex parameter) on the matched patterns. Speci cally, we use the core algorithm (SPAMc) from Pex-SPAM [Ho et al. 2005], which extends the fast SPAM algorithm [Ayres et al. 2002] with gap and regular expression con- straints. (p. 11)

In order to identify patterns whose usage more clearly di er between the two groups, we lter the s-frequent patterns based on the p value of a t-test comparing pattern i-support between the groups (in line 6). Although this approach relies on multiple comparisons by the t-test statistic between the groups, the t-test is not used to prove that the two groups of sequences di er. Rather, it is employed as a heuristic for identifying more interesting patterns in an exploratory analysis. (p. 11)

3.3 Performance Evolution Phase Identi cation (p. 12)

In many CBLEs, such as the Betty’s Brain environment described in Section 4, a student’s work can be assessed in terms of performance on a learning or problem-solving task. (p. 12)

Di erent patterns of performance evolution may be important in situating learning behaviors with respect to the student’s cognitive and metacognitive activities. (p. 12)

To identify these performance phases, we employ a bottom-up, time-series linear segmentation algorithm [Keogh et al. 2004]. This method begins with the nest- grained PLR in which each pair of consecutive points is a separate segment. At each step every pair of consecutive segments is considered for merging into a new segment, and the pair with the lowest error metric (for the merged segment) is merged. (p. 13)

n our technique, we employ a standard linear regression to generate each segment and use sum-squared-error (SSE) as the error metric. (p. 13)

Our complete methodology consists of four major steps to compare learning be- haviors between productive phases of performance and counter-productive phases for a set of students: (1) Action abstraction: Log les are processed to produce a sequence of actions for each student by mapping sets of interaction events to canonical actions. Each canonical action is split into high (-REL” suffix) and low (-IRR” suffix) relevance types based on its content relation to recent, previous actions. Finally, any subsequences of a repeated action are condensed into a single repeated action (-MULT” suffix). (2) Performance phase identi cation: Student action sequences are split into sub- sequences using the time-series segmentation algorithm. These subsequences are ltered to produce two sequential datasets: a) productive action sequences corresponding to segments with a positive progress slope above a given cuto , and b) counter-productive action sequences corresponding to segments with a negative progress slope below a given (negative) cuto . (3) Di erential sequence mining: The productive and counter-productive datasets are compared to identify di erentially frequent sequential patterns of action. (4) Interpretation: The di erentially frequent sequential patterns of action are in- terpreted in terms of e ective and ine ective learning behaviors exhibited by students during the learning task. Investigation of pattern details (i.e., raw event details for instances of these patterns) may yield further insights into student cognition and metacognition, as well as potential ags and triggers for adaptive feedback/sca olding in the system. (p. 14)

The cognitive and metacognitive activity model captures the complexity of the Betty’s Brain task, and also highlights the wide array of cognitive tasks that stu- dents should attend to as they are both constructing and monitoring their under- standing of the science domain. (p. 19)

  1. METHODS (p. 19)

To analyze the interaction traces from this study, we abstract student activities into actions in ve categories: (1) READ: students access one of the pages in the resources; (2) Editing: students edit the causal map, with actions further divided by whether they operated on a causal link (\LINK”) or concept (\CONC”) and by whether the action was an addition (\ADD”), removal (\REM”), or modi cation (\CHG”), e.g., LINKREM or CONCADD; (3) QUER: students use a template, illustrated in Figure 2, to check their teaching by querying Betty, and she answers using causal reasoning through chains of links [Leelawong and Biswas 2008]; (4) EXPL: students probe Betty’s reasoning by asking her to explain her answer to a query, which she does through a demonstration of her causal reasoning process with dialogue and animation on the causal map; (5) QUIZ: students assess how well they have taught Betty by having her take a quiz, which is a set of questions chosen and graded by the Mentor agent. (p. 20)

Categorized student actions were further distinguished by their relevance to re- cent actions, as explained in Section 3.1. (p. 20)

This step is key – feature engineering to decide relevancy. (p. 20)

For this analysis, an action was considered relevant (-REL”) if it was related to at least one of the previous actions within a 3-action window, and irrelevant (-IRR”) otherwise. For Betty’s Brain, a prior action is considered relevant to the current action if it is related to, or operates on, one of the same map links (or one of the same con- cepts if one of the actions being compared only relates to a single concept). This relevance summarization provides an indication of informedness for map build- ing/re nement actions and of diagnosticity for map monitoring activities [Biswas et al. 2010]. Overall, the relevance summarization maintains some action context by providing a rough measure of strategy focus or consistency over a sequence of actions. (p. 20)

  1. RESULTS (p. 20)

16 of the students taught their agent a correct, com- plete map or one very close to it (these students achieved map scores between 11 and 15, inclusive, where 15 was the maximum possible score). Another 18 students taught their agents relatively poor maps with a map score of 5 or below. Only 6 students had a map score in between these groups (i.e., a map score of 6 to 10, inclusive). Therefore, we focus on an analysis and comparison of the learning ac- tivities of the high-performing (\Hi”) student group and the low-performing (\Lo”) student group. (p. 21)

To identify and compare learning behaviors illustrated by these students’ inter- action traces, we applied the di erential sequence mining technique described in Section 3.2 (p. 22)

to identify a variety of interesting learning behaviors that were not apparent from separately considering s-frequent or i-frequent patterns. To illustrate these results, Table III presents the top ve patterns (that contained at least two actions) in each of the di erential categories detailed in Section 3.2. In this analysis, we employed an s-support threshold of 50% to analyze patterns that were evident in the majority of either group of students and employed a cuto of p < 0:05 for a reasonable con - dence that pattern usage di ered between the two groups. (p. 22)

The Hi group’s di erentially frequent patterns in Table III suggest a greater tendency to check low-relevance additions of causal links (possibly links added based on prior knowledge or guesses) by taking quizzes or by reading relevant resources, as described in the progress-monitoring strategies in Section 4.2. (p. 22)

To gain further insight into the most productive learning behaviors exhibited by the students in the Betty’s Brain environment, we employed the di erential sequence mining technique to identify patterns di erentiating the productive sub- sequences of the Hi and Lo groups. Speci cally, all subsequences corresponding to productive segments in the Hi group were compared with all subsequences cor- responding to productive segments in the Lo group, with the results presented in Table V. (p. 25)

Pintrich, P., Smith, D., Garcia, T., and McKeachie, W. 1993. Reliability and predictive validity of the Motivated Strategies for Learning Questionnaire (MSLQ). Educational and psychological measurement 53, 3, 801{813. (p. 29)