Citekey: @Aturban2015

Aturban, M., Nelson, M. L., & Weigle, M. C. (2015). Quantifying orphaned annotations in (pp. 15–27). doi:10.1007/978-3-319-24592-8_2


An empirical investigation of orphaned annotations in The results are clearly stated in the paper. This paper highlights the significant challenge faced by web annotations – the persistence of the first-order content that maintains the context of annotations.

(This paper has a light, useful intro to


In this paper, we investigate the prevalence of orphaned annotations, where a live Web page no longer contains the text that had previously been annotated in the annotation system (containing 6281 highlighted text annotations). We found that about 27 % of highlighted text annotations can no longer be attached to their live Web pages. Unfortunately, only about 3.5 % of these orphaned annotations can be reattached using the holdings of current public web archives. For those annotations that are still attached, 61% are in danger of becoming orphans if the live Web page changes. This points to the need for archiving the target of annotations at the time the annotation is created. (p. 15)

Haslhofer et al. [9] define annotation as associating extra pieces of information with existing web resources. Annotation types include commenting on a web resource, highlighting text, replying to others’ annotations, specifying a segment of interest rather than referring to the whole resource, tagging, etc. (p. 15)

In early 2013, Hypothes.is1, an open annotation tool, was released and is publicly accessible for users to annotate, discuss, and share information. It provides different ways to annotate a web resource: highlighting text, adding notes, and commenting on and tagging a web page. In addition to that, it also allows users to share an individual annotation URI with each other as an independent web resource. The annotation is provided in JSON format and includes the annotation author, creation date, target URI, annotation text, permissions, tags, comments, etc. (p. 15)

We find that 27% of the highlighted text annotations at are orphans, and only a few can be reattached using web archives. Further, we show that 61 % of the currently attached annotations could potentially become orphans if their live Web resources change, because there are no archived versions of the annotated resources available. (p. 18)

2 Related Work (p. 18)

Annotation has long been recognized as an important and fundamental aspect of hypertext systems [11] and an integral part of digital libraries [1], but broad adoption of general annotation for the Web has been slow. (p. 18)

There has been previous work in developing annotation systems to support collaborative work among users and in integrating the Open Annotation Data Model [15] with the Memento framework. (p. 18)

The Open Annotation Collaboration (OAC) [9] has been introduced to make annotations reusable through different systems like Before publishing OAC, annotations would not be useful if the annotated web pages were lost because annotations were not assigned URIs independent from the web pages’ URIs. By considering annotations as first-class web resources with unique URIs, annotation not only would become reusable if their targets disappear, but also would support interactivity between systems. (p. 18)

Sanderson and Van de Sompel [16] built annotation systems which support making web annotations persistent over time. They focus on integrating features in the Open Annotation Data Model with the Memento framework to help reconstructing annotations for a given memento and retrieving mementos for a given annotation. (p. 18)

In other work [8,14] researchers built annotation systems that can deliver a better user experience for specialized users and scholars. (p. 19) The interface allows users to create different types of annotations: (1) making a note by highlighting text and then adding comments and tags about the selected text, (2) creating highlights only, (3) adding comments and tags without highlighting text, and (4) replying to any existing annotations. (p. 19)

5 Conclusions (p. 25)

In this paper, we analyzed the attachment of highlighted text annotations in We studied the prevalence of orphaned annotations, and found that 27 % of the highlighted text annotations are orphans. We used Memento to look for archived versions of the annotated pages and found that orphaned annotations can be reattached to archived versions, if those archived versions exist. We also found that for the majority of the annotations, no memento exists in the archives. (p. 25)

Blog Logo

Bodong Chen



Crisscross Landscapes

Bodong Chen, University of Minnesota

Back to Home