Citekey: @johnson2014ethics

Johnson, J. A. (2014). The ethics of big data in higher education. International Review of Information Ethics, 7:3–10.



The immediate challenge considers the extent to which data mining’s outcomes are themselves ethical with respect to both individuals and institutions. A deep challenge, not readily apparent to institutional researchers or administrators, considers the implications of uncritical understanding of the scientific basis of data mining. (p. 1)

These challenges can be met by understanding data mining as part of a value-laden nexus of problems, models, and interventions; by protecting the contextual integrity of information flows; and by ensuring both the scientific and normative validity of data mining applications. (p. 1)

The aim of data mining is to identify relationships among variables that may not be immediately apparent using hypothesis-driven methods. (p. 2)

Baker suggests four areas of application: building student models to individualize instruction, mapping learning domains, evaluating the pedagogical support from learning management systems, and scientific discovery about learners. Kumar and Chadha suggest using data mining in organizing curriculum, predicting registration, predicting student performance, detecting cheating in online exams, and identifying abnormal or erroneous data. More recent applications have embraced such suggestions, exploring course recommendation systems, retention, student performance, and assessment. (pp. 2–3)

data mining is gaining hold operationally at the institutional level, predicting student success and personalizing content in online and traditional courses; making Netflix-style course recommendations, monitoring student progress through their academic programs, and sometimes intervening to force student action; modeling campus personal networks and student behavior with an eye toward identifying lack of social integration and impending withdrawal from the institution based on facilities usage, administrative data, and social network data. Admissions and recruiting are also growth areas for data mining. (p. 3)

Challenges of using Big Data in Higher Education (p. 3)

Consequentialism: The Immediate Challenge (p. 3)

The most prominent of these are the related problems of privacy and individuality. (p. 3)

consent (p. 3)

Data mining can create group profiles that become the persons represented, treating the subject as a collection of attributes rather than a whole individual and interfere with treating the subject as more than a final predictive value or category. (pp. 3–4)

This suggests that the students, far from being understood as individuals, are simply bundles of skills that need to be matched to an outcome. (p. 4)

undermine individuals’ autonomy (p. 4)

This is a classic example of paternalism, the “use of coercion to make people morally better.” (p. 4)

An especially complicated form of interference is the creation of disciplinary systems, wherein the control of minutiae and constant surveillance lead subjects to choose the institutionally preferred action rather than their own preference, a system that generally disregards autonomy. (p. 4)

Scientism: The Deep Challenge (p. 5)

In fact, the most difficult challenges may be ones of which institutional researchers are least aware. In the process of designing a data mining process, institutional researchers build both empirical and normative assumptions, meanings, and values into the data mining process. These choices are often obscured by a strong tendency toward scientism among data scientists. (p. 5)

Scientism is a trap that, if not avoided, can do substantial harm to students. But unfortunately, current examples of data mining in higher education have embraced, rather than rejected, scientism. (p. 5)

But one of the key recent findings in both the philosophy and the sociology of science is the value-ladenness of science and technology. (p. 5)

Vialardi and colleagues note that predictive analytic models “are based on the idea that individuals with approximately the same profile generally select and/or prefer the same things.” This very behaviorist model of human nature is at the foundation of every data model. (p. 6)
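
To make the "same profile, same preferences" idea concrete, here is a minimal sketch of a nearest-neighbor course recommender. It is not taken from Johnson or from Vialardi and colleagues; the function names (`jaccard`, `recommend`), the choice of Jaccard similarity, and the enrollment data are all assumptions invented for illustration.

```python
from collections import Counter


def jaccard(a, b):
    """Share of courses two students have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, profiles, k=2):
    """Suggest courses taken by the k students most similar to `target`."""
    taken = profiles[target]
    # Rank every other student by how much their profile overlaps the target's.
    neighbors = sorted(
        (s for s in profiles if s != target),
        key=lambda s: jaccard(taken, profiles[s]),
        reverse=True,
    )[:k]
    # Count the courses those neighbors took that the target has not.
    votes = Counter(c for s in neighbors for c in profiles[s] - taken)
    return [course for course, _ in votes.most_common()]


# Hypothetical enrollment data, invented for illustration only.
profiles = {
    "ana": {"stats101", "cs101", "phil210"},
    "ben": {"stats101", "cs101", "cs201"},
    "cai": {"stats101", "cs101", "cs201", "math220"},
    "dee": {"art100", "hist150"},
}

print(recommend("ana", profiles))
```

Nothing in this sketch consults what the student herself wants; the recommendation is derived entirely from what similar profiles did, which is the behaviorist assumption the quote describes.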

Practical Ethics for Ethical Data Mining (p. 6)

The ethical questions presented in data mining will be clearer when building a data mining model is situated in relation to the perceived need for the policy, the interventions that are proposed, the expected outcomes of the policy, and the ways in which the policy will be evaluated; problems such as incompatibilities between the assumptions of the data model and those of the intervention will only be apparent from this perspective. (p. 6)

But most importantly, where there are gaps in the reasoning researchers should identify the assumptions that allowed those gaps to be bridged uncritically and then subject those assumptions to critical analysis. (p. 6)

One approach to these problems that allows for a comprehensive analysis without an extensive technical background in ethics is to consider the contextual integrity of data mining practices. (p. 6)

it can be expanded to include respecting the integrity of the individual and of the university. As actors within the informational context, changes to how the actors understand themselves are equivalent to changes in the actors, and the actors’ goals and values are themselves part of the context whose integrity is to be maintained. (p. 7)


Bodong Chen



Crisscross Landscapes

Bodong Chen, University of Minnesota
