Update: The CritSuite Toolset Project has been completed. This page is now part of an archive of CritSuite web pages. The domain http://crit.org no longer belongs to this project or to Foresight Institute. For current information on CritSuite, please see the site maintained by the author of the software, Ka-Ping Yee:
  Enhancing the World Wide Web: Social Software for the Evolution of Knowledge  

M1+ Implementation Proposal

Foresight Web Enhancement Project
Ka-Ping Yee, 26 May 1997

This document is copyright 1997 by Ka-Ping Yee. All rights reserved.


  1. Introduction
  2. Requirements
  3. Objectives
  4. Design Overview
    1. Mediator Operation
    2. Link Types
    3. Information Gathering
    4. Rapid Publishing
    5. Limitations
  5. Deployment
    1. Link Representation
    2. Rapid-Publishing Markup
    3. Limitations
  6. Implementation
    1. Language
    2. Internal Architecture
      1. Module m1conf
      2. Module m1http
      3. Module m1lex
      4. Module m1gather
      5. Module m1linkdb
      6. Module m1url
      7. Module m1ui
      8. Module m1med
      9. Module m1req


1. Introduction

In an attempt to get the ball rolling, I am trying to scope out the high-level design but also to specify as many details as possible. So this is really three documents in one. Sections 2 and 3 comprise a requirements and objectives specification. Section 4 is a design specification. Sections 5 and 6 form an implementation outline. Each of these three components is written to be considered independently: any one of a number of designs might meet our objectives (but here's one way to do it) and any one of a number of implementations might conform to this design (but here's one way to do it).

Credit for ideas should go to everybody. Specific attributions are pointed out where possible.

2. Requirements

The required functionality components I propose are

  1. the ability for a reader to see, in any given document on the Web, the existence and location of links constructed by people other than the author of that document
  2. the ability to represent links between any two documents, at least of the type where the link comes from an entire document and references a fine-grained target
  3. the ability to store such links in a fashion that does not need to reside within both documents
  4. the ability to assign to such links some amount of additional metainformation (such as a link type)

Or, in fewer (and familiar) words, I suggest that we want bidirectional, fine-grained, extrinsic, typed links in this implementation.

Note that, for now, we are leaving out filtered links and multi-ended links (the Web need not be just a graph — it might eventually be a hypergraph). We are not implementing correlation-based article filtering and rating (yet). We are also explicitly omitting HyperWave(TM) functionality such as collection management, ownership rights, payment, searching/indexing, and universal resource naming (not to mention a 3-D landscape viewer).

3. Objectives

The proposed objectives of this project, in decreasing priority, are:

  1. to implement the requirements in a manner that maximizes accessibility to current WWW users
  2. to keep the implementation flexible enough to allow rapid prototyping and extension
  3. to achieve decent performance for a fairly small number of users (e.g. under 10000 requests per day)
  4. to have a working implementation ready quickly (e.g. by the end of June)

Potential goals beyond the essential requirements include:

The implementation to be described addresses only the first of these additional goals, but any implementation should leave doors open enough to allow them all.

4. Design Overview

The design to be described specifically addresses requirements a, c, and d. Requirement b (link representation) is left to the implementation.

The general proposal to scan and acquire links from the Web (rather than keeping a separate database of annotations) is due to Mark Miller, who named it M1. Augmented with the annotation-posting facility to be described in Section 4.4, I will refer to this design as M1+. (This design is also described in a bit more detail than M1 and involves some further decisions that Mark may or may not agree with...)

4.1. Mediator Operation

Central to this design is a mediator program through which readers must fetch their documents. My definition of a mediator is a program which (a) operates as both an HTTP server and a client, much like a proxy, (b) performs some useful processing on the document, and (c) modifies hyperlinks within documents so that further documents in the user's browsing session will continue to be requested via the mediator. Using my terminology, in this model the client makes a request to the mediator, which then contacts the origin server to fetch the raw document; the mediator produces a processed document which it serves to the client.

While proxying is a little inconvenient for users and leads to some performance limitations, it seems to be the only way to provide universal accessibility. A pseudo-proxy, though it involves still more processing, is more flexible than a plain HTTP proxy because it can be reached by users inside firewalls (and perhaps its use is easier to describe since it avoids any browser-specific details).

The M1+ mediator is responsible for fetching any document from the publicly-accessible Web, processing its HTML text so as to present bidirectional link information, and sending the result on to the user.

4.2. Link Types

It is proposed that links remain directed, i.e. asymmetric. Every link consists of two locations, a head and a tail, where the tail must always lie within a document authored by the creator of the link. The tail is the location of the reference and the head is the location referred to, in analogy to the head and tail of an arrow.

It is proposed that links be allowed to be either coarse (referring to an entire document) or fine (referring to a specific passage in a document) on either the head or the tail.

(For comparison, both previous annotator implementations only supported fine-grained heads and coarse-grained tails. Usenet supports visible heads and visible tails, both coarse-grained. Most current Web browsers implement only fine-grained tails. Some truly advanced browsers such as Lynx, Traveler, and my hacked copy of Grail also support coarse-grained tails. The Web in general allows fine-grained or coarse-grained heads, but link heads are never visible. To make matters worse, fine-grained heads are essentially made useless except in only very few browsers like Grail which actually implement them sensibly.)

It is further suggested that people be encouraged to classify each link as one of four types:

  1. query (request for clarification)
  2. comment (ancillary information or a related topic)
  3. support (positive argument)
  4. issue (correction, adjustment, or negative argument)

where the type name indicates the relationship of the tail object to the head object. For example, many of the existing "go check out this neat site!" kinds of links on the Web might be classified as "comments".

These basic semantic link types are drawn from suggestions by Dean Tribble, Eric Drexler, and others.

4.3. Information Gathering

It is a requirement of this design that all link information be represented in HTML in the tail documents themselves. This keeps the link information public, which may later aid scalability. (The manner of representation of link types and granularities in HTML is left to the implementation in Section 5.1.)

While processing any requested document, the mediator scans it for this link information according to the chosen syntax, and stores it for future use. When a document containing known link heads is requested, the mediator inserts reverse links — in effect, filling in both ends of a link so that current web browsers can make use of them. Since the most popular web browsers only support visibility of fine-grained link tails, we require the addition of coarse-grained link tails as well as both fine-grained and coarse-grained link heads.

In the original design, the mediator could also periodically refetch documents for examination (either from a list of known documents which users edit, or from the history of previously requested documents). This is perhaps an implementation issue, but I believe that scanning only requested documents ought to be enough for now.

4.4. Rapid Publishing

The + in M1+ refers to one extra feature that is not in the essential requirements, which is the ability for users to rapidly publish simple documents when they wish to annotate a given page or contribute to a discussion without having to create an HTML document on their own site. This helps to make up for the earlier restriction that all links must reside in public HTML documents authored by the link creator.

To allow for rapid publishing the mediator needs to add some user interface components which allow a reader to select a link head in the document being viewed, and then to enter a document for the link tail (effectively an annotation).

Documents entered in this fashion are stored on the mediating server at a publicly-accessible URL. In this design we do not require that they be editable after being posted. The implementation of this feature will likely include a mechanism for adding some simple markup to the plain entered text (see Section 5.2).

4.5. Limitations

The use of a mediator makes it impossible for anyone to annotate documents that are inside firewalls. It also may potentially yield many problems with frame documents (a specific implementation for dealing with frame documents is possible, but I propose to leave that problem for later).

By encouraging people to write links in their own HTML documents, we alleviate the need for people to get special software to do authoring, but we also open ourselves to an anarchy of link types (or to lazy authors who never bother to classify their links).

5. Deployment

5.1. Link Representation

HTML was designed to allow REL and REV attributes on links (which stand for "relationship" and "reverse relationship" respectively). We can use these, even though most browser manufacturers have totally ignored them. Note that REL is defined to give the relationship of the head object to the tail object. Since the link types we described in the design (see Section 4.2) give the relationship of the tail to the head, we ask authors to use our link types with the REV attribute.

Links with coarse-grained tails are represented in HTML using the <LINK> tag within the <HEAD> element of a document. Again very few browsers bother to implement this, but we can use it.

Links with fine-grained tails are represented in HTML, as they are now, as text flows contained in an anchor element marked by the <A> and </A> tags.

Links with coarse-grained heads are represented in HTML, as they are now, as document URLs given in the HREF attribute of a <LINK> tag or an <A> tag.

Links with fine-grained heads are represented as document URLs with a fragment identifier. We use the existing convention that a hash (or number, or pound) sign separates the requested URL from the fragment identifier. But we must invent a new scheme to be used in the identifier itself, since HTML only supports fine-grained heads where the head anchor is explicitly marked in the head document. This degrades — well, about as well as you could ever ask, since the unmediated link will still point at the correct document (coarsely).

The scheme is based on a finding a search string for context and then identifying the portion of that search string which is to form the head anchor of the link. The search string is to be sought in the HTML text data after tags have been removed. Tags or any amount of whitespace in the head document become word breaks. Any punctuation is removed, leaving only letters or digits in each word. This scheme requires that link anchors contain only entire words, not parts of words.

Since URLs may not contain spaces, we stipulate that the identifier consists of the words in the search string separated by hyphens. Searches are case-sensitive. Finally, we enclose the anchor portion of the string in parentheses. An example of a reference to the words "may not contain spaces" at the beginning of this paragraph would be


The two namespaces are completely separate, since current fragment identifiers are SGML NAME identifiers, which are restricted to letters, digits, and underscores. Both parentheses and hyphens are legal in URLs but illegal in SGML NAMEs. The remaining punctuation characters which happen to be legal in URLs, according to RFC 1738, are

+  _  =  .  /  *  ,  @  '  $  :  ;  &  !  ?

which are all available for later use to represent more interesting kinds of search strings.

(Instead of the hyphen, any word delimiter would do — such as the underscore or the period — but hyphens are pretty easy to type.)

5.2. Rapid-Publishing Markup

When the user enters a comment for rapid publishing, the comment is entered as plain text. The body of the comment is automatically marked up according to the following conventions:

As well as the comment body, the user is also asked to supply the title of the new document. These elements are organized into a complete HTML document posted on the mediator's server (probably with a filename based on the document title).

5.3. Limitations

The search string identifier scheme requires that all link head anchors be text strings. It is not possible to make a link referring to (for example) an image in another document, although you could make a link to a phrase of text including the image. This could be alleviated by adding some search syntax for finding specific tags.

6. Implementation

6.1. Language

My preference is to implement the system in Python, to allow easy object-oriented work, facilitate modularity and code reuse, and provide quick prototyping abilities. Python's extensibility allows us to later implement more speed-critical parts of the code in C if necessary.

6.2. Internal Architecture

The code is to be divided into the following main modules (presented here roughly in order of decreasing depth):

Information flows from the HTTP client through the lexical analyzer to the user interface, which uses the link database and URL rewriting utilities to add value. The information also flows from the lexical analyzer to the link gatherer, which saves detected links in the link database. This information is a series of tokens, each of which may be either text data or metadata (such as a representation of an HTML tag or an indicator token for some other purpose). Metadata tokens also contain state information (such as the HTML tag stack).

For maximum performance the data is processed with a streaming model which deals with information piecemeal as it comes over the wire from the origin server. This streaming model is supported by a category of stream-processing objects, all of which have a method to accept a list of incoming data tokens and contain a reference to the next stream-processing object in the chain. Each such object maintains its own queue of data, releasing tokens along the chain only when their state is completely determined. (For example, if the lexical analyzer sees a tag-opening <, it should queue data until it has found the end of the tag.)

The document mediator module is responsible for calling upon the HTTP client to provide the raw document to the user interface. The top-level request handler interfaces the document mediator with the outside world. Although the request handler module may in future be a complete HTTP server of its own, for the first implementation it can be just a CGI script.

The rapid publishing utility is independent of the mediator and may be implemented as a separate CGI script. It is expected that this program will also use the m1linkdb module to manipulate the link database, and get its configuration information from the m1conf module. It will also incorporate a module named m1markup which is responsible for accepting a plain-text document and applying rapid-publishing markup.

6.2.1. Module m1conf

This module would contain configuration information such as the host server's name and address, ports to bind to, relevant URLs, and other string constants.

6.2.2. Module m1http

This module opens and manages an HTTP client connection to a specified origin server, generating a stream of text data as it comes over the wire. The interface to this module includes:

class HTTPSession
Constructed with a complete HTTP request, an object of this class tries to produce the requested document. It may maintain its own cache to draw on. If it does not get the document from the cache, it creates a new connection to an origin server and makes the request. It must parse the HTTP header. If the content type is not HTML, it produces binary-data tokens that are passed through the other stream processors untouched. If the contained document is HTML, all content data received is sent to the next stream processor as text tokens. Any errors or other conditions are indicated by sending metadata tokens down the stream.

6.2.3. Module m1lex

This module is responsible for breaking a stream containing an HTML document into text and tag tokens, while maintaining a tag stack. To correctly maintain the tag stack, this module needs to contain a small database of circumstances in which tags need to be implicitly closed or inserted.

class HTMLLex
Constructed with a reference to the next stream-processing object, an object of this class accepts HTML text data and produces a stream of text and tag tokens. Tag tokens produced by this object contain the tag stack state.

6.2.4. Module m1gather

This module gathers link information for the database.

class LinkGatherer
Constructed with a LinkDB, an object of this class provides a stream sink that accepts text and tag tokens. Incoming tag tokens which indicate links (i.e. <A> and <LINK> tags) are assembled into entries in the link database.

6.2.5. Module m1linkdb

This module manages a link database in permanent storage (likely implemented with a public-domain database library such as BSDDB or GDBM). The interface to this module includes:

class LinkDB
Allows direct reading and writing of the link database, for use by the annotation interface.
class LinkFinder
Represents a collection of link anchors which can be compared to a stream of data. As data is added to the stream, the method forwards as much data as possible with tokens inserted to represent anchor limits (beginning or ending points of anchors). An object of this class is constructed given a LinkDB and a particular URL for a target document.

6.2.6. Module m1url

This module provides a number of utilities for manipulating URLs — joining them, resolving them with respect to a given base URL, and creating extended URLs that reference the mediator.

6.2.7. Module m1ui

This module hopefully separates the user interface from the rest of the mediator's inner workings. The stream processing object in the user interface accepts the tags and tokens from a LinkFinder object, and turns the special tokens for link anchors into the desired user interface components (for example, linked dots, arrows, or buttons).

6.2.8. Module m1med

This module accepts a request for a document to be mediated, and sets up the appropriate objects and stream processors from the above classes to perform the processing. It may maintain its own cache of pre-processed documents.

6.2.9. Module m1req

This module deals with the details of CGI request parameters and produces a request for the mediator module to perform. It is the final receiver of data from the mediator, and dumps the data to the client.