Bodong Chen

Crisscross Landscapes

Notes: Linguistic Reflections of Student Engagement in Massive Open Online Courses



Wen, M., Yang, D., & Rose, C. P. (2014). Linguistic Reflections of Student Engagement in Massive Open Online Courses. In ICWSM’14.

Citekey: @Wen2014


This paper introduces a novel approach to use linguistic features to measure student motivation and cognitive engagement (two important aspects for learning) in MOOC discussion forum.


“Based on the learning sciences literature, we quantify a student’s level of engagement in a MOOC from two different angles: (1) displayed level of motivation to continue with the course and (2) the level of cognitive engagement with the learning material.” (p. 1)

“Automated fine-grained content analyses offer the potential to detect and monitor evidence of student engagement and how it relates to other aspects of their behavior.” (p. 1)

“As a methodological contribution, in this paper we investigate using computational linguistic models to measure learner motivation and cognitive engagement from the text of forum posts” (p. 1)

“Conversation in the course forum is replete with terms that imply learner motivation.” (p. 1)

“Our methodology allows us to differentiate better among these students, and to identify danger signs that a struggling student is in need of support within a population whose interaction with the course offers the opportunity for effective support to be administered.” (p. 1)

“In this paper, we attempt to automatically measure learner motivation based on such markers found in posts on the course discussion forums.” (p. 1)

“We measure this difference in cognitive engagement with an estimated level of language abstraction. We find that users whose posts show a higher level of cognitive engagement are more likely to continue participating in the forum discussion.” (p. 2)

“In this paper, we test the generality of our measures in three Coursera MOOCs focusing on distinct subjects. We demonstrate that our measures of engagement are consistently predicative of student dropout from the course forum across the three courses.” (p. 2)

“It is important to monitor learner motivation and how it varies along the course weeks. We propose to automatically measure learner motivation based on linguistic cues in the forum posts.” (p. 2)

“There are mainly three types of information available about the participation patterns of MOOC users: the survey information, the clickstream behavioral data, and the forum posts.” (p. 2)

“Poellhuber et al. (2013) find that user goals specified in the pre-course survey were the strongest predictors of later learning behaviors.” (p. 2)

“One exception is Ramesh et al. (2013), which uses sentiment and subjectivity of user posts to predict engagement/disengagement. However, neither sentiment nor subjectivity ended up being predictive of engagement in that work.” (p. 2)

“An interesting paper to check. Contradicts with our findings”

“Not sure whether it’s true”

“The level of a student’s motivation strongly influences the intensity of the student’s participation in the course.” (p. 3)

“4.1.1 Creating the Human-Coded Dataset: MTurk” (p. 3)

“We used Amazon’s Mechanical Turk (MTurk) to make it practical to construct a reliable annotated corpus for developing our automated measure of student motivation.” (p. 3)

“In order to construct a hand-coded dataset for training machine learning models later, we employed MTurk workers to rate each message with the level of learner motivation towards the course the corresponding post had.” (p. 3)

“Using MTurk is a cool idea, although researchers may challenge the validity of the analysis”

“LIWC-cognitive words (Table 1, line 3): The cognitive mechanism dictionary in LIWC (Pennebaker & King, 1999) includes such terms as “thinking”, “realized”, “understand”, “insight” and “comprehend”.” (p. 4)

“First person pronouns” (p. 4)

“Positive words (Table 1, line 5) from the sentiment lexicon (Liu et al. 2005) are also indicators of learner motivation.” (p. 4)

“Another paper by Wen actually raises concerns around this statement.”

“4.1.3 Linguistic Markers of Learner Motivation” (p. 4)

“we work to find domain-independent motivation cues so that a machine learning model is able to capture motivation expressed in posts reliably across different MOOCs.” (p. 4)

“Apply words” (p. 4)

“The Apply lexicon we use consists of words that are synonyms of “apply” or “use”: “apply”, “try”, “utilize”, “employ”, “practice”, “use”, “help”, “exploit” and “implement”.” (p. 4)

“Need words (Table 1, line 2) show the participant’s need, plan and goals: “hope”, “want”, “need”, “will”, “would like”, “plan”, “aim” and “goal”.” (p. 5)

“4.2.1 Measuring Level of Language Abstraction” (p. 5)

“Previous work measures level of language abstraction with Linguistic Inquiry and Word Count (LIWC) word categories (Gill & Oberlander, 2002; Pennebaker & King, 1999; Yarkoni, 2010; Beukeboom, 2013). For a broader word coverage, we use the automatically generated abstractness dictionary from Turney et al. (2011) which is publicly available.” (p. 5)

“The experiments in this section confirm that our theoryinspired features are indeed effective in practice, and generalize well to new domains. The bag-of-words model is hard to be applied to different course posts due to the different content of the courses.” (p. 5)

“4.2 Level of Cognitive Engagement” (p. 5)

“We believe that level of language abstraction reflects the understanding that goes into using those abstract words when creating the post.” (p. 8)