Bodong Chen

Crisscross Landscapes

Notes: Ciccarese (2013) Web annotation as a first-class object



Citekey: @Ciccarese2013

Ciccarese, P., Soiland-Reyes, S., & Clark, T. (2013). Web annotation as a first-class object. IEEE Internet Computing, 17(6), 71–75. doi:10.1109/MIC.2013.123


A thorough explanation of the history of web annotation.


The W3C Open Annotation Data Model provides facilities for annotating content directly on the Web without changing the original content.Third-party semantic annotations of content are now emerging as rst-class objects on the Web. (p. 71)

Scholars have made handwritten notes and comments in books and manuscripts for centuries. Today’s blogs and news sites typically invite users to express their opinions on the published content; URLs let them share Web resources with accompanying annotations and comments using third-party services such as Twitter or Facebook. These contributions have until recently been constrained within speci c services, making them useful, but second-class citizens of the Web. (p. 71)

Now, Web annotations are emerging as fully independent linked data in their own right, no longer restricted to plain textual comments in application silos. Annotations can range from bookmarks and comments, to ne-grained annotations of a selected section of a frame in a video stream. Technologies and standards already exist to create, publish, syndicate, mash up, and consume nely targeted, semantically rich digital annotations on practically any content: annotations have become rst-class Web citizens. This development is driven by a need for collaboration and annotation reuse among domain researchers, computer scientists, scienti c publishers, and scholarly content databases. (p. 71)

What do these applications have in common? First, the annotations they support have a shared subject, typically identi ed by a URI (for example, to a webpage, image, or video). Second, these annotations live within a single system, at either the resource provider or a third-party bookmarking site. (p. 71)

Up until recently, no common annotation model has existed independently of what is being annotated. (p. 71)

Freeing Annotation from Content Silos (p. 71)

Although this might not pose a signi cant problem for ordinary social media users, it’s a big problem in digital scholarship. This challenge is driving a signi cant change in how we can implement and reuse annotations across systems and domains. (p. 71)

Content annotation in one form or another is a basic, de ning part of Web 2.0. (p. 71)

Semantics, Not Just Comments (p. 72)

In July 2013, Google announced the availability of 800 million documents annotated by 11 billion Freebase concepts.1 This is semantic tagging. (p. 72)

Using formal vocabularies specied in the Web Ontology Language (OWL), we can create richer, more computationally tractable structures than by using dictionaries or encyclopedias. Formal ontologies relate elements in one domain to another, using formal predicates, and let us perform automated reasoning over these knowledge structures. They might also supplement, identify, or tag dictionary and encyclopedia entries themselves. (p. 72)

machine assistance to reading, comprehension, and knowledge accumulation is increasingly necessary (p. 73)

Semantic tagging or linking is now a reality, and extends beyond Google and Freebase. Perhaps its most important application is in scholarship. For example, the Europe PubMed Central database of scienti c articles (http:// now semantically tags all its entries with protein, gene, and chemical compound names found in the text. But dealing with these tags raises further important questions. (p. 73)

How do we represent tags and other annotations? Are they injected in the text, or stored separately? How do we share them across systems? (p. 73)

by injecting targeted elements from separate annotation stores. When a user accessed a webpage, any annotations of that page could be loaded from and published to separate Annotea servers, one of which the W3C formerly hosted (and which is now decommissioned). Annotea consisted of an RDF schema with an accompanying REST API for annotation discovery and publication. (p. 73)

And how can we make this representation live as linked data on the Web — as a rst-class citizen? (p. 73)

These capabilities are now becoming available to Web developers using linked data annotation tools and models. (p. 73)

it laid out a basic path that subsequent systems followed and exploited. (p. 73)

The Open Annotation Model (p. 73)

The Evolution of Web Annotation (p. 73)

Building on RDF and partially inspired by Annotea, two parallel projects were begun independently in 2009: the Open Annotation Collaboration (OAC) and the Annotation Ontology (AO). Both models adopted Annotea’s fundamental structure: an annotation with a target (annotated resource) and a body (content) that concerns that target. (p. 73)

More than a decade ago, seminal lines of research on the Distributed Link Service (DLS)2 and Conceptual Open Hypermedia (COHSE)3 explored an early form of Web annotation. Their main goal was to enhance collaboration via shared metadata-based Web annotations, bookmarks, and their combinations. (p. 73)

The Annotea data model was lightweight. Annotations were stored separately from the documents they annotated, and consisted of three parts. An Annotation instance annotates a Web resource by associating it with a body — another Web resource — such as a comment or reply (typically XHTML embedded as an XML literal). The annotation had provenance properties such as author, created, and modi ed, which specialized corresponding Dublin Core elements. Finally, a context could indicate the speci c selection within an annotated document, typically using an XPointer. Several subclasses of Annotation were de ned — for instance, Question, Comment, or Example. (p. 73)

Around the same time, Annotea, an early W3C project, enabled collaboration by sharing annotations and metadata,4 which were attached to a full or partial webpage. (p. 73)

OAC was designed to support digital humanities, whereas AO was geared toward biomedical use cases (p. 73)

Linked Data Annotation Tools (p. 74)

Several systems already use annotations to produce linked data. (p. 74)

DBpedia Spotlight is a tool for automatically annotating mentions of DBpedia resources in text (http:// (p. 74)

The Domeo Annotation Toolkit ( lets users trigger external text mining services or algorithms and transform their results into annotations that human agents can further curate (see Figure 1). (p. 74)

Utopia Documents (http://getutopia. com) is a PDF reader that, when opening a research article, can overlay manipulable visualizations of the data behind the article, providing data analysis tools and links to external resources. (p. 74)

OAC and AO took similar enough approaches to be able to mergeintoasingleproject (p. 74)

The Open Annotation Model (OA; see the sidebar)6 combined and elaborated these prior projects under the umbrella of the W3C Open Annotation Community Group (www. The OA model incorporates features from its predecessors, but with further exibility and richness, catering to a wider range of annotation needs. (p. 74)

Another valuable example of annotation and linked data is Maphub (, an online application for exploring and annotating digitized, high-resolution (p. 74)

historical maps (see Figure 2). (p. 75)

OA is new: we’ve never before had a comprehensive, fully featured model for linked data annotations, with signi cant community buy-in. As applications and annotation communities fully adopt this model for their work, linked data annotations will become rst-class, fully interoperable Web citizens — primary Web content in their own right. (p. 75)