Bodong Chen

Crisscross Landscapes

Notes: Guo. (2014). Demographic Differences in How Students Navigate Through MOOCs



Citekey: @Guo2014-ve

Guo, P. J., & Reinecke, K. (2014). Demographic Differences in How Students Navigate Through MOOCs. In Proceedings of the First ACM Conference on Learning @ Scale Conference (pp. 21–30). New York, NY, USA: ACM.


A nice study exploring demographic differences (age, gender, country origin) among MOOC learners. Interesting findings that many learners skipped a significant portion of course content, navigated in a non-linear manner, etc., pointing to opportunities for future research. However, the interpretations of some results remain debatable.


This paper presents an empirical study of how students navigate through MOOCs, and is, to our knowledge, the first to investigate how navigation strategies differ by demographics such as age and country of origin. We performed data analysis on the activities of 140,546 students in four edX MOOCs and found that certificate earners skip on average 22% of the course content, that they frequently employ non-linear navigation by jumping backward to earlier lecture sequences, and that older students and those from countries with lower student-teacher ratios are more comprehensive and non-linear when navigating through the course. (p. 1)

From these findings, we suggest design recommendations such as for MOOC platforms to develop more detailed forms of certification that incentivize students to deeply engage with the content rather than just doing the minimum necessary to earn a passing grade. (p. 1)

less consideration on the actual of meaning itself, compared to the educational literature. (p. 1)

In particular, we hypothesized that there should be notable differences in how students navigate through the learning content depending on demographics such as age and country of origin. (p. 1)


xMOOCs have been criticized for simply replicating traditional lecture-based teaching [2] and not catering to different learning strategies, such as a student’s preference for either a more or less linear structure of instructional content [20]. (p. 2)

Differences in students’ approaches to learning are often described with the help of Witkin’s distinction between field-independent and field-dependent learners [24]. Fieldindependent learners predominantly approach the learning content in an analytic manner, focusing first on details before abstracting from a specific problem. In contrast, fielddependent learners first reason about context before focusing on details [24]. (p. 2)

Field-independent learners are believed to be fairly confident in defining their own learning paths and navigating in non-linear learning environments [9]. They are also often referred to as “explorers”, indicating that they navigate more freely without necessarily following the path suggested by content creators [17]. Field-dependent learners, in contrast, prefer following an externally defined learning path (p. 2)

Analysis Variables (p. 3)

Demographics (p. 3)

Researchers have found that, in countries with a higher student-teacher ratio, students are more accustomed to a teacher-centered education and behave more like “observers” than “explorers” [11]. (p. 3)


Data Set (p. 3)

Motivation & Intent (p. 3)

• Certificate earned – whether a student earned a certificate of completion or not. (p. 3)

We have made an anonymized version of our data set and analysis scripts publicly available at http://www.pgbovine. net/edX/ so that anyone can reproduce and build upon this paper’s findings. (p. 3)

• Grade – Students earn a grade between 0% and 100% depending on their performance on problem sets and exams. (p. 4)

• Coverage – The fraction of total learning sequences (lectures, problem sets, and exams) that the student visited. (p. 4)

• Discussion forum events – The number of times a student posted to the discussion forum, divided by that student’s total number of courseware access events, which controls for variability in student activity levels. (p. 4)

Navigation (p. 4)

We quantify the following kinds of non-linear navigation through the course materials: (p. 4)

• Backjumps – The number of times that this student navigated backwards from a learning sequence to another one released earlier in the term (e.g., from Lecture 6 to Lecture 4), divided by the total number of sequences visited by this student. (p. 4)

• Textbook events – The number of times that this student accessed the digital textbook associated with the course, divided by the student’s total number of courseware access events. Since the textbook still resides within the edX website but lies outside the main flow of a course, we count textbook events as an instance of non-linear navigation. (p. 4)

Analyses (p. 4)

we conducted multiple linear regression analyses and report their ANOVA F statistics, p-values, and, when applicable, the unstandardized b coefficients for each independent variable in the regression. For this paper, we do not analyze fine-grained navigation within a learning sequence. We also do not use time as a feature (p. 4)


We also found that age distributions vary by country. In particular, students from countries with lower student-teacher ratios seem to participate in MOOCs later in life than students from countries with higher student-teacher ratios (i.e., larger class sizes). (p. 4)

We found that age and country of origin have significant effects on the fraction of sequences that certificate earners cover in all four courses (see Table 4 for a summary of the F statistics). (p. 5)

Motivation & Intent (p. 5)

Age is positively correlated with coverage, even when accounting for other demographic variables (regression coefficients b = .001 − .01, p < .001 for all four courses1). Figure 3 visualizes how older certificate earners cover more of the learning content, with a 10% difference in coverage between the under-20 and the over-50-year-old groups. (p. 5)

Coverage (p. 5)

Table 4 also shows that a student’s country of origin has a statistically significant effect on their coverage. (p. 5)

Certificate-earning students viewed, on average, 67% of the learning sequences in 3.091x, 77% in PH207x, 81% in CS188.1x, and 86% in 6.00x. Thus, students ignore, on average, 22% of the materials in those courses, yet still earn certificates. (p. 5)

Table 5 shows that certificateearning students from countries with higher student-teacher ratios usually visit fewer learning sequences. (p. 6)

Course Navigation Strategy (p. 6)

Given that all certificate earners had a common intent – to earn a passing grade – we wanted to understand how they went about doing so, and whether navigation strategies differed by demographic. (p. 6)

The first part of the sentence needs to be warranted. (p. 6)

We focused on backjumps because the ability to go “back in time” to view prior lectures or to re-try old assessments is one key differentiator of MOOCs over traditional residential courses with in-person lectures and exams. (p. 6)

Certificate earners in our four courses performed an average of 1.04 backjumps for every learning sequence they visited (sd=0.62). (p. 6)

In contrast, students who did not earn certificates performed only 0.3 backjumps per visited sequence (sd=0.4). Thus, certificate earners repeat visiting prior sequences three times as often, presumably to review older content. (p. 6)

Kinds of backjumps (p. 6)

Each backjump can start and end at either an assessment (i.e., problem set or exam) or a lecture. (p. 6)

Students most frequently backjump from an assessment to a lecture, which shows that they might be opportunistically looking up specific information needed to answer assessment questions. The second most prevalent kind of backjump is between two lectures, potentially when students are re-watching old lectures for more in-depth learning. Such lecture-to-lecture backjumps occur significantly less frequently (30% of total backjumps averaged across four courses) than assessment-tolecture backjumps (54% across four courses) (independent ttest, T4.6 = 5.2, p < .01). (p. 7)

A student’s country of origin has a significant effect on the proportion of backjumps for three of the courses (p. 7)

student-teacher ratio had a negative correlation with backjumps, even controlling for other demographics and grades (p. 7)

Demographics and backjumps (p. 7)

Older certificate-earning students backjump more frequently. (p. 7)

researchers have previously called “explorers” [17] in that they do not necessarily follow a given path. (p. 8)

Digital textbook usage (p. 8)

The results in the prior sub-section indicate that older students and those from countries with a low student-teacher ratio perform more backjumps. One possible explanation is that this behavior is emblematic of more independent learning and their preferences for non-linear navigation. (p. 8)

This trend is further supported by differences in the number of textbook events between demographic subgroups. Textbooks events are another type of non-linear navigation, since the digital textbook for each course is located outside of the main flow of the course materials. Again, a multiple regression analysis showed that older certificate earners access the textbook more frequently (b = .00003 − .0001, p < .001 for all courses). (p. 8)


Our findings suggest that most students navigate through the learning content in a non-linear way. (p. 8)


A central limitation of this study is that by just analyzing log data, we cannot directly measure students’ true motivations, engagement, intent for enrolling in a MOOC, or their knowledge gained after passing the course. Coverage of learning sequences, as well as data on their certificates and grades, therefore served as proxies in our analyses. (p. 9)