Bodong Chen

Crisscross Landscapes

Notes: Berners-Lee. (2006). Creating a science of the Web



Citekey: @Berners-Lee2006

Berners-Lee, T. (2006). Creating a science of the Web. Science, 313(5788), 769–771. doi:10.1126/science.1126902



Understanding and fostering the growth of the World Wide Web, both in engineering and societal terms, will require the development of a new interdisciplinary field. (p. 1)

If we want to model the Web; if we want to understand the architectural principles that have provided for its growth; and if we want to be sure that it supports the basic social values of trustworthiness, privacy, and respect for social boundaries, then we must chart out a research agenda that targets the Web as a primary focus of attention. (p. 1)

When we discuss an agenda for a science of the Web, we use the term “science” in two ways. Physical and biological science analyzes the natural world, and tries to find microscopic laws that, extrapolated to the macroscopic realm, would generate the behavior observed. Computer science, by contrast, though partly analytic, is principally synthetic: It is concerned with the construction of new languages and algorithms in order to produce novel desired computer behaviors. Web science is a combination of these two features. The Web is an engineered space created through formally specified languages and protocols. However, because humans are the creators of Web pages and links between them, their interactions form emergent patterns in the Web at a macroscopic scale. These human interactions are, in turn, governed by social conventions and laws. Web science, therefore, must be inherently interdisciplinary; its goal is to both understand the growth of the Web and to create approaches that allow new powerful and more beneficial patterns to occur. (p. 1)

However, it turns out that human topics of conversation on the Web can be analyzed by looking at a matrix of links (7, 8). (p. 2)
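Reader's note: reference 7 is Brin and Page's search-engine paper, so one way to read "a matrix of links" is as a link-based ranking computed by power iteration. The sketch below is only an illustration of that idea, not the method of either cited paper. The toy graph, the function name `pagerank`, and the damping value 0.85 are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration over a link graph given as {page: [pages it links to]}.

    A minimal sketch: every page starts with equal rank, and on each pass a
    page hands a damped share of its rank to the pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the same small baseline amount.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page Web: each key links to the pages in its list.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph, page `c` has the most inbound links and ends up with the largest score, while `d`, which nothing links to, keeps only the baseline amount.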

The engineering challenge is to allow independently developed data systems to be connected together without requiring global agreement as to terms and concepts. (p. 2)

Leading Web researchers discussed the scientific and engineering problems that form the core of Web science at a workshop (p. 2)

Despite excitement about the Semantic Web, most of the world’s data are locked in large data stores and are not published as an open Web of inter-referring resources. (p. 2)

Substantial research challenges arise in changing this situation: how to effectively query an unbounded Web of linked information repositories, how to align and map between different data models, and how to visualize and navigate the huge connected graph of information that results. In addition, a policy question arises as to how to control the access to data resources being shared on the Web. (p. 2)

One particular ongoing extension of the Web is in the direction of moving from text documents to data resources (see the figure). (p. 2)

Although computer and information science have generally concentrated on the representation and analysis of information, attention also needs to be given to the social and legal relationships behind this information (9). Transparency and control over these complex social and legal relationships are vital, but require a much better-developed set of models and tools that can represent these relationships. (p. 2)

The Web yesterday and today. (Left) The World Wide Web circa 1990 consisted primarily of text content expressed in the Hypertext Markup Language (HTML), exchanged via the hypertext transfer protocol (HTTP), and viewed with a simple browser pointing to a Universal Resource Locator (URL). (Right) Users of the Web now have a variety of top-level tools to access richer content including scalable vector graphics, the Semantic Web, multimodal devices (e.g., voice browsers), and service descriptions. These are expressed in extended markup language (XML), exchanged by newer protocols [e.g., HTTP 1.1 and SOAP (simple object access protocol)] and are addressed by uniform resource identifier (URI) schemes. (p. 2)

In the Web of human-readable documents, natural-language processing techniques can extract some meaning from the human-readable text of the pages. These approaches are based on “latent” semantics, that is, on the computer using heuristic techniques to recapitulate the intended meanings used in human communication. By contrast, in the “Semantic Web” of relational data and logical assertions, computer logic is in its element, and can do much more. (p. 2)
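Reader's note: to make "relational data and logical assertions" concrete, here is a minimal sketch of facts stored as subject-predicate-object triples plus one inference rule applied until nothing new can be derived. The rule mirrors the RDFS subclass rule (if x has type C and C is a subclass of D, then x has type D). The `ex:` names, the sample facts, and the helper functions `infer_types` and `match` are hypothetical and are not taken from the article.

```python
# Hypothetical facts, written as (subject, predicate, object) triples.
triples = {
    ("ex:TimBL", "rdf:type", "ex:Researcher"),
    ("ex:Researcher", "rdfs:subClassOf", "ex:Person"),
    ("ex:Person", "rdfs:subClassOf", "ex:Agent"),
}


def infer_types(facts):
    """Apply (x type C) and (C subClassOf D) => (x type D) until a fixpoint."""
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        for subject, predicate, cls in list(closed):
            if predicate != "rdf:type":
                continue
            for sub, relation, parent in list(closed):
                derived = (subject, "rdf:type", parent)
                if relation == "rdfs:subClassOf" and sub == cls and derived not in closed:
                    closed.add(derived)
                    changed = True
    return closed


def match(facts, pattern):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        triple
        for triple in facts
        if all(want is None or want == got for want, got in zip(pattern, triple))
    )


inferred = infer_types(triples)
for triple in match(inferred, ("ex:TimBL", "rdf:type", None)):
    print(triple)
```

The query returns the asserted type together with the two derived ones, none of which required interpreting natural-language text.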

The need for better mathematical modeling of the Web is clear. (p. 2)

Web science is about more than modeling the current Web. It is about engineering new infrastructure protocols and understanding the society that uses them, and it is about the creation of beneficial new systems. It has its own ethos: decentralization to avoid social and technical bottlenecks, openness to the reuse of information in unexpected ways, and fairness. It uses powerful scientific and mathematical techniques from many disciplines to consider at once microscopic Web properties, macroscopic Web phenomena, and the relationships between them. Web science is about making powerful new tools for humanity, and doing it with our eyes open. (pp. 2–3)

7. S. Brin, L. Page, in Proceedings of the 7th International World Wide Web Conference (Elsevier Science, Amsterdam, 1998), pp. 107–117. 8. Z. N. Oltvai, A.-L. Barabási, Science 298, 763 (2002). (p. 3)