read

References

Citekey: @Segedy2015-sy

Segedy, J. R., Kinnebrew, J. S., & Biswas, G. (2015). Using Coherence Analysis to Characterize Self-Regulated Learning Behaviours in Open-Ended Learning Environments. Journal of Learning Analytics, 2(1), 13–48.

Notes

Nice work! So this work follows this ‘logic’: SRL is important and we have good theoretical models based on decades of research. We have Betty’s Brain as an ‘open-ended learning environment’ (note that’s different from open learning environments), where SRL is key for learner success. Coherence Analysis then kicks in as a novel approach to assessing SRL, grounded in an SRL model and situated in Betty’s Brain; CA derives a set of indices that are predictive of learning performance, measured by knowledge tests in this study. These indices are then used to cluster students. Sequence mining is then applied to further interpret the clusters (incorporating temporal information extracted from log analysis).

It is rare to see a strong theory-driven approach + clear articulation of assumptions + clear operationalization of metrics in learning analytics research. This paper is one of those exceptions.

That said, I wonder how easy it would be for researchers to apply the CA indices in a different context. More generally, I also wonder:

  • when will LA research move beyond knowledge tests as the benchmark of learning, and
  • are there other ways to investigate questions in this context, besides the “indices-extraction”–clustering–“contrasting-clusters” approach? (I lack a good term for this approach.)

Highlights

we present our work in developing and evaluating coherence analysis (CA), a novel approach to interpreting students’ learning behaviours in OELEs. (p. 1)

SRL is an active theory of learning that describes how learners are able to set goals, create plans for achieving those goals, continually monitor their progress, and revise their plans when necessary. SRL is a multi‐faceted construct: it involves emotional and behavioural control, management of one’s learning environment and cognitive resources, perseverance in the face of difficulties, and social interactions to promote effective learning (Zimmerman & Schunk, 2011). (p. 1)

open‐ended computer‐based learning environments (OELEs; Clarebout & Elen, 2008; Land, Hannafin, & Oliver, 2012), which provide a learning context and a set of tools for exploring, hypothesizing, and building solutions to authentic and complex problems. (p. 2)

To validate our approach, we applied CA to data from a recent classroom study with the Betty’s Brain OELE (Kinnebrew, Segedy, & Biswas, 2014; Leelawong & Biswas, 2008). Results demonstrate the effectiveness of CA in 1) predicting students’ task performance and learning gains and 2) identifying common problem‐solving approaches among the students in the study. Further, the results demonstrate relationships between CA‐derived metrics and students’ prior skill levels, offering a potential explanation for students’ problem‐solving behaviours. Taken together, these results provide insight into students’ SRL processes and suggest targets for adaptive scaffolds to support students’ development of open‐ended problem‐solving skills. (p. 2)

While OELEs may vary in the particular sets of tools they provide, they often include tools for 1) seeking and acquiring information, 2) applying that information to a problem‐solving task, and 3) assessing the quality of the constructed solution. (p. 3)

Roscoe, Segedy, Sulcer, Jeong, & Biswas describe SRL as containing “multiple and recursive stages incorporating cognitive and metacognitive strategies” (2013, p. 286). Their description of SRL is summarized in Figure 1; it presents SRL as involving phases of orientation and planning, enactment and learning, and reflection and self‐assessment. (p. 4)

Figure 1. A model of SRL according to the description in Roscoe et al. (2013) (p. 4)

Measuring students’ self‐regulation and metacognitive behaviour in real time is a difficult task; it requires developing systematic analysis techniques for detecting aspects of goal setting, planning, monitoring, and reflection in the context of the learning environment. (p. 4)

Despite this complexity, researchers have developed several approaches to measuring aspects of self-regulation and metacognition in OELEs. For example, MetaTutor (Bouchet et al., 2013) adopts a very direct approach; it provides interface features through which students can externalize their SRL processes. (p. 5)

Another approach employed in several OELEs involves developing a predictive, data‐driven model for diagnosing constructs related to SRL in real time (e.g., engagement, frustration, or confusion). (p. 5)

In other OELEs, researchers have developed theory‐driven models of SRL and embedded those models into learning environments. For example, EcoLab (Luckin & du Boulay, 1999; Luckin & Hammerton, 2002) measures students’ metacognitive awareness of their own ability by comparing the system’s assessment of students’ ability levels with the difficulty of the activities they choose to pursue. (p. 5)

coherence analysis (CA), and applied that approach to the interpretation of log data from Betty’s Brain. However, rather than focusing on a specific aspect of SRL (e.g., awareness of one’s own ability), CA focuses on students’ ability to seek for, interpret, and apply information encountered while working in the OELE. In doing so, CA models aspects of students’ problem‐solving skills and metacognitive abilities simultaneously. (p. 6)

4 COHERENCE ANALYSIS (p. 7)

To develop this approach, we first performed a task‐driven analysis of Betty’s Brain (similar to cognitive task analysis; Chipman, Schraagen, & Shalin, 2000) to derive 1) the primary tasks that students should be able to complete to succeed in an OELE, and 2) the processes students must execute to complete those tasks. (p. 7)

4.1 Cognitive and Metacognitive Problem-solving Processes in OELEs (p. 8)

These tasks and their more specific implementations in Betty’s Brain have been incorporated into a task model (Figure 3) that specifies the tasks important for achieving success in Betty’s Brain. The highest level of the model identifies the three broad classes of OELE tasks related to 1) information seeking and acquisition, 2) solution construction and refinement, and 3) solution assessment. Each of these task categories is further broken down into three levels that represent 1) general task descriptions common across all OELEs (according to the definition of OELE discussed above); 2) Betty’s Brain specific instantiations of these tasks; and 3) interface features in Betty’s Brain through which students can accomplish their tasks. (p. 8)

The directed links in the task model represent dependency relations. (p. 8)

A great example of theory-driven analytics. (p. 8)

Identifying and evaluating the relevance of information describes the processes students employ as they observe, operate on, and make sense of the information presented in an OELE’s information acquisition tools (Land, 2000; Quintana et al., 2004). Productively employing these processes requires an understanding of how to identify critical information and interpret it correctly. (p. 8)

The task model, along with the model of SRL presented in Figure 1, identifies and draws connections among the cognitive and metacognitive processes critical for learning in OELEs. Students need to leverage their metacognitive knowledge and task understanding in order to select intermediate goals for completing their tasks and then create plans for coordinating their use of skills and strategies in service of achieving those goals. Creating these plans requires understanding the purposes of, and relationships among, the tasks identified in the task model. Effective plans will utilize information gained from both information acquisition and solution assessment activities in order to build and refine a causal map that more closely approximates the expected solution. Because students are likely to make mistakes in constructing their solutions, they need to understand how to utilize the results of solution assessments to direct their thinking as they reflect on the sources of their errors. (p. 10)

4.2 Modelling Learner Behaviours with Coherence Analysis (p. 10)

The Coherence Analysis (CA) approach analyzes learners’ behaviours by combining information from sequences of student actions to produce measures of action coherence. (p. 10)

When students take actions that put them into contact with information that can help them improve their current solution, they have generated potential that should motivate future actions. The assumption is that if students can recognize relevant information in the resources and quiz results, then they should act on that information. If they do not act on information that they encountered previously, CA assumes that they did not recognize or understand the relevance of that information. (p. 11)

These two notions come together in the definition of action coherence: Two ordered actions (x → y) taken by a student in an OELE are action coherent if the second action, y, is based on information generated by the first action, x. In this case, x provides support for y, and y is supported by x. Should a learner execute x without subsequently executing y, the learner has created unused potential in relation to y. Note that actions x and y need not be consecutive. (p. 11)

The task model (Figure 3) implies two critical coherence relations: 1) applying information acquired from the hypertext resources to editing the map; and 2) applying inferred link correctness information (as obtained via quizzes) to editing the map. More specifically, an information seeking action (e.g., reading about a causal relationship) may generate support for a future solution construction action (e.g., adding that causal relationship to the map). Similarly, a solution construction action can be supported by a solution assessment action. (p. 11)

CA assumes that learners with higher levels of action coherence possess stronger metacognitive knowledge and task understanding. Thus, these learners will perform a larger proportion of supported actions and take advantage of a larger proportion of the potential that their actions generate. (p. 11)

Action coherence metrics measure whether or not learners’ actions take advantage of previously encountered information. To measure whether or not a learner’s actions contradict the information generated during previous activities, CA also incorporates measures of action incoherence: Two ordered actions (x → y) taken by a student in an OELE are action incoherent if the second action, y, is action coherent with the negation of information generated by the first action, x. In this case, x provides negative support for y, and y is contradicted by x. (p. 12)
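To make these two definitions concrete, here is a toy sketch in Python. It assumes each information-seeking action yields a signed causal claim and each map edit adds a signed link; this representation is my assumption for illustration, not the paper’s.

```python
# Toy illustration of the action coherence/incoherence definitions above.
# Assumption: an information-seeking action x yields a causal claim
# (source, target, sign), and a solution construction action y adds a link
# of the same shape. This representation is hypothetical, not the paper's.

def relation(x_claim, y_link):
    """Classify the ordered pair x -> y as coherent, incoherent, or neither."""
    src, tgt, sign = x_claim
    if y_link == (src, tgt, sign):
        return "coherent"    # y is supported by x
    if y_link == (src, tgt, -sign):
        return "incoherent"  # y is contradicted by x
    return "neither"

# Reading that cold temperatures increase heat loss (+1), then editing the map:
claim = ("cold temperatures", "heat loss", +1)
print(relation(claim, ("cold temperatures", "heat loss", +1)))  # coherent
print(relation(claim, ("cold temperatures", "heat loss", -1)))  # incoherent
```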

CA assumes that learners with higher levels of incoherence among their actions possess a weaker understanding of the science domain and the relations between different concepts in the domain. (p. 12)

Our hypotheses in developing CA were as follows: 1. Students’ CA‐derived metrics would predict their learning and success in teaching Betty; 2. Students’ prior levels of task understanding would predict their CA‐derived metrics while using Betty’s Brain. (p. 12)

5 EXPERIMENTAL STUDY WITH BETTY’S BRAIN (p. 13)

Ninety‐nine 6th grade students from four mid‐Tennessee science classrooms participated in the study. (p. 13)

Students used Betty’s Brain to learn about human thermoregulation when exposed to cold temperatures. The expert map, shown in Figure 4, contained 13 concepts and 15 links representing cold detection (cold temperatures, heat loss, body temperature, cold detection, hypothalamus response) and three bodily responses to cold: goose bumps (skin contraction, raised skin hairs, warm air near skin, heat loss), vasoconstriction (blood vessel constriction, blood flow to the skin, heat loss), and shivering (skeletal muscle contractions, friction in the body, heat in the body). (p. 13)

5.4 Learning Assessments

Learning was assessed using a pre-test–post-test design with two parts: a set of computer-based exercises and a set of paper-and-pencil questions. (p. 14)

5.5 Log File Analysis (p. 18)

The log files provided the information required to calculate a measure of task performance for each student. By tracking the evolution of a student’s causal map, we could compute how the student’s causal map score changed over time. The map score at any point in time is calculated as the number of correct links (i.e., links that appear in the expert map) minus the number of incorrect links in the student’s map. (p. 18)
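As a concrete illustration, a minimal sketch of this map score, assuming links are represented as (source, target) tuples; the representation is my assumption, but the scoring rule is the one quoted above.

```python
# Map score = number of correct links (links that appear in the expert map)
# minus the number of incorrect links in the student's map, per the quote above.

def map_score(student_links: set, expert_links: set) -> int:
    correct = len(student_links & expert_links)
    incorrect = len(student_links - expert_links)
    return correct - incorrect

# Example: one correct link and one incorrect link yields a score of 0.
expert = {("cold temperatures", "heat loss"), ("heat loss", "body temperature")}
student = {("cold temperatures", "heat loss"), ("friction", "body temperature")}
print(map_score(student, expert))  # 1 - 1 = 0
```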

The log files also served as input to CA, which automatically calculated the following metrics for each student (p. 19):

1. Edit frequency: the number of causal link edits and annotations made by the student divided by the number of minutes that the student was logged onto the system.
2. Unsupported edit percentage: the percentage of causal link edits and annotations not supported by any of the previous views that occurred within a five-minute window of the edit.
3. Information viewing time: the amount of time spent viewing either the science resource pages or Betty’s graded answers. Information viewing percentage is the percentage of the student’s time on the system classified as information viewing time.
4. Potential generation time: the amount of information viewing time spent viewing information that could support causal map edits that would improve the map score. To calculate this, we annotated each hypertext resource page with information about the concepts and links discussed on that page. Potential generation percentage is the percentage of information viewing time classified as potential generation time.
5. Used potential time: the amount of potential generation time associated with views that both occur within a prior five-minute window and also support an ensuing causal map edit. Used potential percentage is the percentage of potential generation time classified as used potential time.

Metrics one and two capture the quantity and quality of a student’s causal link edits and annotations, where supported edits and annotations are considered to be of higher quality. Metrics three, four, and five capture the quantity and quality of the student’s time viewing either the resources or Betty’s graded answers. These metrics speak to the student’s ability to seek and identify information that may help her build or refine her map (potential generation percentage) and then utilize information from those pages in future map‐editing activities (used potential percentage). (p. 19)
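A minimal sketch of how the first two metrics might be computed from a log, assuming a hypothetical log format of (timestamp_in_seconds, action_type, link) tuples. The five-minute support window follows the definition quoted above; everything else is an assumption for illustration.

```python
WINDOW = 5 * 60  # five-minute support window, in seconds

def ca_metrics(log, session_minutes):
    """Compute edit frequency and unsupported edit percentage from a toy log."""
    edits = [(t, link) for (t, kind, link) in log if kind == "edit"]
    views = [(t, link) for (t, kind, link) in log if kind == "view"]

    unsupported = 0
    for t_edit, link in edits:
        # An edit counts as supported if the same link was viewed within
        # the five minutes preceding the edit.
        supported = any(
            t_edit - WINDOW <= t_view <= t_edit and v_link == link
            for t_view, v_link in views
        )
        if not supported:
            unsupported += 1

    edit_frequency = len(edits) / session_minutes
    unsupported_pct = 100.0 * unsupported / len(edits) if edits else 0.0
    return edit_frequency, unsupported_pct

# Toy log: one supported edit (view then edit within 5 min), one unsupported.
log = [
    (0,   "view", "cold -> heat loss"),
    (120, "edit", "cold -> heat loss"),      # supported
    (900, "edit", "friction -> heat loss"),  # unsupported
]
print(ca_metrics(log, session_minutes=30))  # ≈ (0.067, 50.0)
```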

As a complement to the CA‐derived metrics, we employed an information‐theoretic differential sequence mining approach (Kinnebrew, Mack, Biswas, & Chang, 2014) to analyze students’ action sequences further. This allowed us to identify sequential action patterns that best differentiated groups of students defined by the CA‐derived metrics. The analysis considered the following student actions: 1) reading resource pages; 2) adding, removing, or editing causal links in the map (further distinguished by whether or not the edit improved the map score); 3) asking Betty to answer causal questions; 4) having Betty take quizzes; 5) asking Betty to explain her answer to a question; 6) creating, editing, and viewing notes; and 7) annotating links to keep track of their correctness. The derivation of action definitions from raw activity logs is discussed further in (Kinnebrew et al., 2013). (p. 20)

Interesting. Wondering how this ‘information-theoretic differential sequence mining’ works. (p. 20)
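The actual algorithm is described in Kinnebrew, Mack, Biswas, & Chang (2014); as a rough intuition only, a much-simplified sketch conveys the general flavour of “differential” pattern mining: count short action n-grams per student, then rank patterns by how much their mean frequency differs between two groups. This is an intuition aid, not the information-theoretic method itself.

```python
from collections import Counter

def ngram_counts(actions, n=2):
    """Count length-n action patterns in one student's action sequence."""
    return Counter(tuple(actions[i:i + n]) for i in range(len(actions) - n + 1))

def differential_patterns(group_a, group_b, n=2):
    """Rank patterns by the gap in mean per-student frequency across groups."""
    patterns = set()
    for seq in group_a + group_b:
        patterns |= set(ngram_counts(seq, n))

    def mean_freq(pat, group):
        return sum(ngram_counts(seq, n)[pat] for seq in group) / len(group)

    return sorted(patterns,
                  key=lambda p: abs(mean_freq(p, group_a) - mean_freq(p, group_b)),
                  reverse=True)

# Toy example: one group quizzes after edits; the other rarely does.
engaged = [["edit", "quiz", "edit", "quiz"], ["read", "edit", "quiz"]]
guessers = [["edit", "edit", "edit"], ["read", "edit", "edit"]]
print(differential_patterns(engaged, guessers)[:3])
```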

6 RESULTS (p. 20)

6.2 Relationships between CA-Derived Metrics, Learning, and Performance (p. 21)

Students edited their maps fairly often (0.60 times per minute), and on average, 55.7% of these edits were supported. Students spent roughly one‐third of their total time viewing information, but only 65.3% of this viewing time was spent on information that could support causal map edits. Students used, on average, a majority of the potential that they generated (62.3%), and they were mostly engaged in their learning (88.8%). (p. 21)

The results show that several of the CA metrics were significantly and moderately-to-strongly correlated with students’ best map scores. Best map scores were positively correlated with edit frequency (r = 0.56), potential generation percentage (r = 0.41), and used potential percentage (r = 0.60). Best map scores were negatively correlated with unsupported edit percentage (r = –0.49) and disengaged percentage (r = –0.46). Students’ CA metrics were also correlated with their short answer learning gains. More specifically, short answer learning gains were positively correlated with edit frequency (r = 0.40), potential generation percentage (r = 0.26), and used potential percentage (r = 0.26). (p. 23)

To investigate these relationships further, we conducted multiple regression analyses to predict each of the learning gain measures and the best map score with the six CA metrics. The CA metrics predicted best map scores (F = 22.87, p < 0.001, R² = 0.601) and gains on short answer items (F = 4.544, p < 0.001, R² = 0.231) with statistical significance. (p. 23)
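For readers who want to reproduce this style of analysis, a minimal sketch with placeholder data: the numbers below are simulated, and only the analysis shape (Pearson correlations plus a multiple regression) mirrors the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 99                                 # the study had 99 students
used_potential = rng.uniform(0, 1, n)  # placeholder CA metric
edit_freq = rng.uniform(0, 1.2, n)     # placeholder CA metric
best_map = 10 * used_potential + 5 * edit_freq + rng.normal(0, 2, n)  # simulated

# Pearson correlation between one CA metric and best map scores.
r, p = stats.pearsonr(used_potential, best_map)
print(f"r = {r:.2f}, p = {p:.3g}")

# Multiple regression via ordinary least squares (intercept column added).
X = np.column_stack([np.ones(n), used_potential, edit_freq])
beta, *_ = np.linalg.lstsq(X, best_map, rcond=None)
ss_res = np.sum((best_map - X @ beta) ** 2)
ss_tot = np.sum((best_map - best_map.mean()) ** 2)
print(f"R^2 = {1 - ss_res / ss_tot:.3f}")
```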

Together, these analyses provide potential insight into why particular students experienced more or less success. Negative correlations between unsupported edit percentage and information viewing percentage, potential generation percentage, and used potential percentage, along with the positive correlation between unsupported edit percentage and disengaged percentage, may suggest a behaviour profile characterized by disengagement, effort avoidance, and/or a difficulty in identifying causal links in the resources. (p. 23)

In summary, these results provide support for our first hypothesis. Students’ CA‐derived metrics were collectively predictive of their short answer learning gains (p. 23)

To test our second hypothesis — that students’ prior levels of task understanding would predict their CA-derived metrics while using Betty’s Brain — we calculated correlations between CA metrics and students’ pre-test skill levels (Table 6). Most correlations are weak. However, some specific pre-test skill levels were weakly and significantly correlated with the CA metrics. Students with higher causal reasoning scores edited their maps more often (r = 0.23) and used proportionally more of the potential they generated (r = 0.27). (p. 24)

To summarize, we found only limited support for our second hypothesis. Students who were better at interpreting causal relations in text passages during the pre‐test edited their maps somewhat more frequently and exhibited slightly higher levels of coherence (p. 24)

6.3 Exploratory Clustering Analysis (p. 25)

Figure 8 illustrates the resulting dendrogram. The analysis revealed five relatively distinct clusters containing 24, 39, 5, 6, and 24 students. (p. 25)
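The quoted text does not say which clustering settings produced the dendrogram, so here is a hedged sketch using SciPy’s hierarchical clustering with Ward linkage on standardized CA metrics; both the linkage method and the standardization are my assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
metrics = rng.normal(size=(99, 6))  # placeholder: 99 students x 6 CA metrics
z = (metrics - metrics.mean(0)) / metrics.std(0)  # standardize each metric

tree = linkage(z, method="ward")                    # dendrogram structure
labels = fcluster(tree, t=5, criterion="maxclust")  # cut into five clusters
print(np.bincount(labels)[1:])                      # cluster sizes
```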

Cluster 1 students (n=24) may be characterized as frequent researchers and careful editors; these students spent large proportions of their time (42.4%) viewing sources of information and did not edit their maps very often. When they did edit their maps, the edit was usually supported by recent activities (unsupported edit percentage = 29.4%). Most of the information they viewed was useful for improving their causal maps (potential generation percentage = 71.4%), but they often did not take advantage of this information (used potential percentage = 58.9%). (p. 25)

Cluster 2 students (n=39) may be characterized as strategic experimenters. These students spent a fair proportion of their time (33.5%) viewing sources of information, and, like Cluster 1 students, often did not take advantage of this information (used potential percentage = 62.6%). Unlike Cluster 1 students, they performed more map edits, a higher proportion of which were unsupported, as they tried to construct the correct causal model. (p. 25)

Cluster 3 students (n=5) may be characterized as confused guessers. These students edited their maps fairly infrequently and usually without support. They spent an average of 58.9% of their time viewing sources of information, but most of their time viewing information did not generate potential (potential generation percentage = 45.8%). (p. 25)

Cluster 4 (n=6) may be characterized as disengaged from the task. On average, these students spent more than 30% of their time on the system (more than 45 minutes of class time) in a state of disengagement. (p. 25)

Cluster 5 students (n=24) are characterized by a high edit frequency (just over 1 edit per minute), and most of these students’ edits (70.9%) were supported. Additionally, they spent just over one‐third of their time viewing information, and over three‐fourths of this time viewing information that generated potential. These students are distinct from students in the other four clusters in that they used a large majority of the potential they generated (82.0%) and were rarely in a state of disengagement (3.1%). In other words, these students appeared to be engaged and efficient. (p. 26)

To explore more detailed behaviour differences between the identified clusters, we employed the information-theoretic differential sequence mining approach described in Kinnebrew, Mack, Biswas, & Chang (2014) and the Log File Analysis section above. This approach identified the action patterns that best differentiate the five clusters, the top seven of which are presented in Table 10. (p. 27)

Table 10. Means (and standard deviations) of activity pattern frequency by cluster (p. 29)

All of the other top differential activity patterns described in Table 10 involve a combination of map editing (specifically with respect to incorrect links) and quizzing, patterns that have been associated with successful performance in Betty’s Brain (Kinnebrew et al., 2013). For example, patterns 2 and 6 are characteristic of supported map edits based on quiz results. In particular, pattern 6 is characteristic of exploring the quiz results more deeply by viewing Betty’s answer explanation. Patterns 3, 5, and 7 are characteristic of using quizzes to monitor progress. After the student edits the map, they have Betty take a quiz in order to evaluate the effect of that edit on her quiz performance. Pattern 4 combines these two pattern types into a pattern characteristic of an edit‐and‐check strategy, in which students add a link to their map, use a quiz to monitor the effect of that link on Betty’s performance, and then, upon discovering the link is incorrect, remove it from their maps. (p. 29)

All six of these activity patterns display similar relative use across the clusters, having the highest average frequency in engaged and efficient students, moderate frequencies in researchers/careful editors and strategic experimenters, and low frequencies in confused guessers and disengaged students. (p. 29)

Altogether, these results indicate that students in more successful clusters were more likely to employ behaviours illustrating productive uses of the quiz for solution assessment. (p. 30)

Pretty cool work.


Kinnebrew, J. S., Loretz, K., & Biswas, G. (2013). A contextualized, differential sequence mining method to derive students’ learning behavior patterns. Journal of Educational Data Mining, 5(1), 190–219.

Kinnebrew, J. S., Mack, D. L. C., Biswas, G., & Chang, C. K. (2014). A differential approach for identifying important student learning behavior patterns with evolving usage over time. Proceedings of the Mobile Sensing, Mining and Visualization for Human Behavior Inference Workshop at the 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 281–292). Tainan, Taiwan. Retrieved from http://link.springer.com/chapter/10.1007%2F978-3-319-13186-3_27#page-1

Roscoe, R., Segedy, J., Sulcer, B., Jeong, H., & Biswas, G. (2013). Shallow strategy development in a teachable agent environment designed to support self-regulated learning. Computers & Education, 62, 286–297.
