2503ICT: Web Miscellany


Under construction!

This lecture presents a very brief, very selective discussion of alternative technologies, applications and environments.

XML-based applications

The extensible markup language XML is an important technology for document representation, management and communication, and (in my opinion) a less important technology for Web-based applications.

XML is a markup meta-language for representing labelled trees. XML is superficially similar to HTML but differs (a) in using an open-ended set of elements and attributes, (b) in describing content and structure only (omitting all presentation) and (c) in requiring stricter formatting (e.g. all tags must be closed and all attribute values must be quoted strings).
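The stricter formatting rules can be seen with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the note, to and body element names are made up for illustration (XML itself does not define them).

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# An invented vocabulary: XML does not fix the set of element names.
strict = '<note priority="high"><to>Ann</to><body>Lunch?</body></note>'
# HTML-style sloppiness: an unquoted attribute value and an unclosed <br> tag.
sloppy = '<note priority=high><to>Ann</to><br><body>Lunch?</body></note>'

print(is_well_formed(strict))
print(is_well_formed(sloppy))

# A parsed document is a labelled tree: a root element with child elements.
root = ET.fromstring(strict)
child_tags = [child.tag for child in root]
print(root.tag, root.attrib, child_tags)
```

An HTML parser would typically accept the second string; an XML parser must reject it.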

Both HTML 4.01 and HTML5 have XML variants denoted as XHTML.

To use XML effectively requires knowledge of a large number of supporting standards and technologies:

Here is a more detailed summary of XML standards and technologies.

RSS and Atom

Two particularly widely used XML formats are RSS and Atom, which describe the format of syndicated news feeds (for newspapers and Weblogs).

Here is a brief, single-entry example of an Atom 1.0 feed document (from the Atom 1.0 specification):

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

    <title>My news feed</title>
    <subtitle>
        An informative, entertaining and critical view of current news.
    </subtitle>
    <link href="http://example.org/"/>
    <updated>2003-12-13T18:30:02Z</updated>
    <author>
        <name>John Doe</name>
        <email>johndoe@example.com</email>
    </author>
    <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>

    <entry>
        <title>Atom-Powered Robots Run Amok</title>
        <link href="http://example.org/2003/12/13/atom03"/>
        <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
        <updated>2003-12-13T18:30:02Z</updated>
        <summary>Some text.</summary>
    </entry>

</feed>

Atom feeds can be much more complex than this example indicates. Here is a comparison between RSS 2.0 and Atom 1.0.

Many Web sites provide such RSS and/or Atom news feeds. Collections of news feeds are managed by news readers such as Google Reader (closing) and Bloglines (both Web-based), NetNewsWire (for Mac OS X) and BlogBridge (all platforms). Many browsers such as Opera and Safari can also be used directly as news readers. The use of news readers makes reading large volumes of information very efficient. The design of a news reader is an exercise in the application of the above set of XML standards and technologies.
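As a first step towards a news reader, the sketch below extracts the entries from an Atom feed using Python's standard xml.etree.ElementTree module. The feed text is a shortened copy of the example above; the function name list_entries is an illustrative choice, not part of any standard.

```python
import xml.etree.ElementTree as ET

# The Atom 1.0 namespace, written in ElementTree's {uri}tag notation.
ATOM = "{http://www.w3.org/2005/Atom}"

FEED = """<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>My news feed</title>
  <updated>2003-12-13T18:30:02Z</updated>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href="http://example.org/2003/12/13/atom03"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <summary>Some text.</summary>
  </entry>
</feed>"""

def list_entries(feed_xml):
    """Return a (title, link, updated) tuple for each entry in the feed."""
    root = ET.fromstring(feed_xml.encode("utf-8"))
    entries = []
    for entry in root.findall(ATOM + "entry"):
        link = entry.find(ATOM + "link")
        href = link.get("href") if link is not None else None
        entries.append((entry.findtext(ATOM + "title"),
                        href,
                        entry.findtext(ATOM + "updated")))
    return entries

for title, href, updated in list_entries(FEED):
    print(updated, title, href)
```

A real news reader would also have to fetch feeds over HTTP, remember which entry ids it has already shown, and cope with RSS as well as Atom.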

AtomPub

The Atom Publishing Protocol (AtomPub) is a set of conventions for creating, editing and deleting feeds and feed entries. The AtomPub protocol is an important example of a Web service that is worth describing in a little detail.

An (AtomPub) collection is a resource whose representation is an Atom feed. The resource may be a blog or a newspaper. AtomPub defines a collection's response to GET and POST requests, and also allows PUT and DELETE requests.

A collection is a collection of (AtomPub) members. A member is an entry in an Atom feed, a weblog entry, a news article, a bookmark or, indeed, a multimedia object such as an image, an audio file or a video file.

(AtomPub) clients create members inside a collection by POSTing a representation of the member (e.g., an Atom entry) to the collection URL. The server assigns the new member to the collection and returns an HTTP 201 ("Created") code together with a Location header containing the URL of the new member.
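The client side of this exchange can be sketched with Python's standard urllib.request module. The collection URL and the entry content below are assumptions chosen for illustration, and the request is only built, not sent.

```python
import urllib.request

# Hypothetical collection URL, used only for illustration.
COLLECTION_URL = "http://example.org/blog/entries/"

ENTRY = """<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Atom-Powered Robots Run Amok</title>
  <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
  <updated>2003-12-13T18:30:02Z</updated>
  <author><name>John Doe</name></author>
  <content>Some text.</content>
</entry>"""

def build_create_request(collection_url, entry_xml):
    """Build (but do not send) the POST request that creates a member."""
    return urllib.request.Request(
        collection_url,
        data=entry_xml.encode("utf-8"),
        headers={"Content-Type": "application/atom+xml;type=entry"},
        method="POST",
    )

request = build_create_request(COLLECTION_URL, ENTRY)
print(request.get_method(), request.full_url)

# Sending it would look like this (not run here, since the URL is made up):
#   with urllib.request.urlopen(request) as response:
#       print(response.status, response.headers["Location"])
```

If the server accepts the entry, the status should be 201 and the Location header should hold the URL of the new member.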

Each collection may contain accept tags that describe what kind of members may be POSTed to the collection.

Summary of AtomPub

Resource    Method  Representation  Description
Member      GET     Atom Entry      Retrieve the Atom representation of the entry.
Member      PUT     Atom Entry      Update the member resource with the Atom entry representation.
Member      DELETE  (none)          Delete the member resource.
Collection  GET     Atom Feed       Retrieve a list of the members in the collection. May be a subset.
Collection  POST    Atom Entry      Create a new member resource with the given Atom Entry.

Note that a PUT request requires the URL of an existing member, whereas a POST request is sent to the collection URL and returns a new URL (for the new member resource).
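The behaviour summarised above can be mimicked by a small in-memory model. This is only a sketch under simplifying assumptions: entries are stored as plain strings rather than parsed Atom documents, member URLs are numbered sequentially, and the 200, 204, 404 and 405 status codes are reasonable choices of mine, not requirements quoted from the AtomPub specification.

```python
class Collection:
    """In-memory stand-in for an AtomPub collection (illustrative only)."""

    def __init__(self, url):
        self.url = url        # the collection URL, e.g. "/entries/"
        self.members = {}     # member URL -> entry representation
        self.next_id = 1

    def handle(self, method, url, body=None):
        """Return (status, headers, body) for a single request."""
        if url == self.url:
            if method == "GET":
                return 200, {}, list(self.members.values())
            if method == "POST":
                # The server, not the client, chooses the new member's URL.
                member_url = f"{self.url}{self.next_id}"
                self.next_id += 1
                self.members[member_url] = body
                return 201, {"Location": member_url}, body
            return 405, {"Allow": "GET, POST"}, None
        if url not in self.members:
            return 404, {}, None
        if method == "GET":
            return 200, {}, self.members[url]
        if method == "PUT":
            self.members[url] = body
            return 200, {}, body
        if method == "DELETE":
            del self.members[url]
            return 204, {}, None
        return 405, {"Allow": "GET, PUT, DELETE"}, None

blog = Collection("/entries/")
status, headers, _ = blog.handle("POST", "/entries/", "<entry>first</entry>")
print(status, headers["Location"])
```

The client learns the member URL from the Location header and uses it for all later GET, PUT and DELETE requests.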

HTML forms can submit only GET and POST requests, so a browser alone cannot issue PUT and DELETE requests from a form; dedicated AtomPub clients are therefore required as a "front-end" to the Web server. (This may change with HTML5.)

Many AtomPub clients and servers have been written and deployed.

They are used not only for news feeds and blogs, but also for many other applications.

RESTful Web Services

REST (Representational State Transfer) is a Web Service architecture based directly on HTTP (proposed by Roy Fielding, ca. 1995). State is transferred between client and server through URLs. The client can use all the HTTP verbs (POST, GET, PUT, DELETE and HEAD) to create, retrieve, update, delete and summarise server-side resources (identified by URLs).

References

Examples

Principles

RESTful Web Services use a Resource-Oriented Architecture (ROA). The ROA consists of just four concepts:

  1. Resources (anything worth naming, e.g., documents, database records, physical objects)
  2. Their names (URLs)
  3. Their representations (e.g., XML/HTML/JSON documents)
  4. The links in representations of one resource to other resources

and four properties:

  1. Addressability (every resource has an address, i.e., a URL)
  2. Statelessness (state is passed in representations of resources and in URLs)
  3. Connectedness (resources are connected by links in their representations)
  4. A uniform interface (normally HTTP)
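Addressability and connectedness are easiest to see in the representations themselves. The following sketch builds JSON representations in which each resource carries its own URL and links to related resources. The base URL, the user data and the field names (self, href, collection) are all assumptions made for illustration.

```python
import json

# Hypothetical resources: user id -> name.
USERS = {1: "ann", 2: "bob"}
BASE = "http://example.org/users/"    # assumed URL of the list resource

def user_url(user_id):
    """Every resource has an address (addressability)."""
    return f"{BASE}{user_id}/"

def represent_user_list():
    """Representation of the list resource, linking to each user."""
    return json.dumps({
        "self": BASE,
        "users": [{"name": name, "href": user_url(uid)}
                  for uid, name in sorted(USERS.items())],
    })

def represent_user(user_id):
    """Representation of one user, linking back to the list."""
    return json.dumps({
        "self": user_url(user_id),
        "name": USERS[user_id],
        "collection": BASE,
    })

print(represent_user_list())
print(represent_user(2))
```

A client that starts at the list can reach every user, and return to the list, purely by following links found in the representations (connectedness).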

The design of a RESTful Web Service requires you to:

  1. Identify the resources.
  2. Design the representations accepted from the client.
  3. Design the representations served to the client.
  4. Connect resources to each other.
  5. Decide what's supposed to happen in response to each request.
  6. Figure out what may go wrong and report it.

These principles can and should be applied to the design of all Web applications.

In particular, carefully designing the set of URLs that identify resources (and operations) in a (RESTful) Web application is good practice. For example, the following is a common pattern for identifying lists and individuals.

.../users/          # the list of all users
.../users/n/        # the user with id n

We can then use HTTP verbs to operate on these URLs:

GET    .../users/                              # retrieve a rep'n of the list of all users
POST   ⟨rep'n of new user⟩ .../users/          # add a new user
GET    .../users/n/                            # retrieve a rep'n of the user with id n
PUT    ⟨rep'n of updated user⟩ .../users/n/    # update the user with id n
DELETE .../users/n/                            # delete the user with id n
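On the server, the verb and URL pairs listed above amount to a routing table. The sketch below is one possible way to write such a table in plain Python, assuming an in-memory dictionary as the data store; the handler names and the choice of status codes are illustrative, and a real application would normally rely on the router of its Web framework.

```python
import re

users = {1: {"name": "ann"}}    # hypothetical store: id -> representation

def list_users(body):
    return 200, [users[uid] for uid in sorted(users)]

def add_user(body):
    uid = max(users, default=0) + 1
    users[uid] = body
    return 201, f"/users/{uid}/"        # URL of the new resource

def get_user(body, uid):
    return 200, users[uid]

def update_user(body, uid):
    users[uid] = body
    return 200, body

def delete_user(body, uid):
    del users[uid]
    return 204, None

# One row per (verb, URL pattern) pair.
ROUTES = [
    ("GET",    r"/users/",       list_users),
    ("POST",   r"/users/",       add_user),
    ("GET",    r"/users/(\d+)/", get_user),
    ("PUT",    r"/users/(\d+)/", update_user),
    ("DELETE", r"/users/(\d+)/", delete_user),
]

def dispatch(method, path, body=None):
    """Find the route matching the verb and URL, then call its handler."""
    path_known = False
    for verb, pattern, handler in ROUTES:
        match = re.fullmatch(pattern, path)
        if not match:
            continue
        path_known = True
        if verb != method:
            continue
        ids = [int(group) for group in match.groups()]
        if ids and ids[0] not in users:
            return 404, None            # well-formed URL, but no such user
        return handler(body, *ids)
    # Known URL with an unsupported verb is 405; unknown URL is 404.
    return (405, None) if path_known else (404, None)

print(dispatch("GET", "/users/"))
```

Keeping the table separate from the handlers makes it easy to check that every URL supports exactly the verbs intended.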

If you are restricted to using the GET and POST verbs in HTML forms, you may have to include the real verbs in the URLs:

.../users/          # retrieve all users
.../users/add/      # add a new user
.../users/n/        # retrieve user n
.../users/n/update/ # edit user n
.../users/n/delete/ # delete user n
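One way to keep the server-side design RESTful despite this restriction is to translate such form-friendly URLs back into the intended verb before dispatching. The sketch below assumes that only POSTed forms are translated and that the action is always the last path segment; the function name real_request is made up for illustration.

```python
import re

# The trailing action segment of a URL and the HTTP verb it stands for.
ACTIONS = {"add": "POST", "update": "PUT", "delete": "DELETE"}

def real_request(form_method, path):
    """Translate a form-friendly (method, URL) pair into (verb, resource URL)."""
    match = re.fullmatch(r"(.*/)(add|update|delete)/", path)
    if form_method != "POST" or match is None:
        return form_method, path      # ordinary request: nothing to translate
    resource, action = match.groups()
    return ACTIONS[action], resource

print(real_request("POST", "/users/7/delete/"))
```

With this convention a GET of .../users/n/update/ can still be used to display the edit form, while the POST of that form is treated as a PUT on .../users/n/.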

Of course, this relies on URL routing, which modern Web application platforms such as Ruby on Rails or Django provide; when using plain PHP it is common to use less RESTful notation such as this:

.../users/show_all.php
.../users/add.php
.../users/show.php?id=n
.../users/update.php?id=n
.../users/delete.php?id=n

Note how much clearer the RESTful notation is. However, the syntax of URL patterns is not the most important feature of RESTful applications; it is the combination of the above principles that makes REST the best way to develop practical, maintainable Web applications.

The Semantic Web

See the W3C Semantic Web activity page.

"The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Tim Berners-Lee, James Hendler, Ora Lassila, The Semantic Web, Scientific American, May 2001. See also The Semantic Web Revisited, IEEE Intelligent Systems, Jan. 2006.

The basic functional difference between the semantic Web and the current Web is that computer programs as well as humans will be able to access Web sites and services, to retrieve and use relevant information. However, the semantic Web goes beyond Web services by making data self-describing (using RDF), by allowing flexible terminology (using the ontology language OWL), and by enabling automated reasoning (using logic programming).
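The flavour of self-describing data and automated reasoning can be conveyed with a toy example. In the sketch below, facts are stored as (subject, predicate, object) triples, as in RDF, and two simple rules in the style of RDF Schema are applied repeatedly until nothing new can be derived. All names are invented for illustration; real RDF identifies things by URIs, and real systems use dedicated RDF and OWL tools rather than hand-written loops.

```python
# Each fact is a (subject, predicate, object) triple.
TRIPLES = {
    ("Physiotherapist", "subClassOf", "HealthProvider"),
    ("HealthProvider", "subClassOf", "Organisation"),
    ("clinic42", "type", "Physiotherapist"),
    ("clinic42", "locatedIn", "Brisbane"),
}

def infer(triples):
    """Apply two rules until no new triples appear (forward chaining)."""
    facts = set(triples)
    while True:
        new = set()
        for (a, p, b) in facts:
            for (c, q, d) in facts:
                if b != c:
                    continue
                # Rule 1: subClassOf is transitive.
                if p == "subClassOf" and q == "subClassOf":
                    new.add((a, "subClassOf", d))
                # Rule 2: an instance of a class is an instance of its superclasses.
                if p == "type" and q == "subClassOf":
                    new.add((a, "type", d))
        if new <= facts:
            return facts
        facts |= new

def instances_of(cls, facts):
    """Return the subjects known to have the given type."""
    return sorted(s for (s, p, o) in facts if p == "type" and o == cls)

facts = infer(TRIPLES)
print(instances_of("HealthProvider", TRIPLES))
print(instances_of("HealthProvider", facts))
```

A program that asks for health providers finds nothing in the stated facts alone, but finds clinic42 once the class hierarchy has been taken into account.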

An extended example is given in the above article.

Key technologies that will enable the semantic Web to be implemented include:

See the above articles, and Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by T. Berners-Lee, with M. Fischetti (Harper San Francisco, 1999), for more information.

See the International Semantic Web Conference for recent research.

Web application development considerations

Understanding these considerations properly would require a complete second course.

Pre-constructed Web applications

Often, you don't need to build your own Web application from scratch!

In many cases you can download and configure an application and run it on your own server. In other cases, you can configure the application and run it on the provider's server. Examples include:

Nevertheless, (we believe that) understanding how to build Web applications helps you to install, use and adapt existing Web applications better.