2503ICT: Introduction to Web standards


The purpose of this section is to review and extend your knowledge of HTML and related technologies. The notes are very sketchy; refer to a text or to W3C recommendations for details.

References

Defining Web standards

HTML examples

The previous, 1999 HTML 4.01 standard is now obsolete and all future development work should be done using the still-developing HTML5 standard.

("View Source" is your friend.)

Note that every valid document must have a DOCTYPE declaration and should have a title and a charset specified.
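
As a minimal sketch of these requirements (the title and body text are placeholders, not taken from the course examples), a valid HTML5 document can be as short as:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- the charset should be declared early in the head -->
  <meta charset="utf-8">
  <title>Placeholder title</title>
</head>
<body>
  <p>Hello, Web.</p>
</body>
</html>
```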

HTML5 constructs

Learn to use common tags and constructs (sections, headings, paragraphs, lists, tables, forms, images, text styles, etc.).

All of these tags have optional attributes. Attribute values should be quoted. All presentation properties should be specified in style sheets.
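
A small illustration of these points, assuming a hypothetical class name (logo) and image file (logo.png): the attribute values are quoted, and the border is specified in a style rule rather than in the markup. An embedded style sheet is used here only to keep the sketch in one file.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Quoted attributes</title>
  <style>
    /* presentation lives in the style sheet, not in the markup */
    .logo { border: solid thin blue; }
  </style>
</head>
<body>
  <!-- "logo" and "logo.png" are hypothetical names -->
  <p><img class="logo" src="logo.png" alt="Site logo"></p>
</body>
</html>
```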

Page layout

There is widespread agreement among the developers of popular Web sites about page layout. Their pages all have the following structure:

All developers should follow this standard practice for page layout, even when using a single-column layout.

Multicolumn layout can be specified using framesets, tables or CSS. Each has respective advantages and disadvantages, but only CSS is currently considered good practice.

CSS

We discuss multicolumn layout using CSS below.

HTML Forms

Forms are the fundamental construct for passing user-entered data to a server-side script. Each form must specify how data is passed to the script (the method) and which script is to receive the data (the action). Each form consists of one or more elements, including:

Consider this example form for getting personal details.
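
The linked example is not reproduced in these notes. A rough sketch of such a form is shown here; the field names and the script name save_details.php are assumptions made for illustration only.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Personal details</title>
</head>
<body>
  <!-- method says how the data is passed; action names the receiving script -->
  <!-- save_details.php is a hypothetical script name -->
  <form method="post" action="save_details.php">
    <p><label>Name: <input type="text" name="name"></label></p>
    <p><label>Age: <input type="number" name="age" min="0"></label></p>
    <p><label>State:
      <select name="state">
        <option value="qld">Queensland</option>
        <option value="nsw">New South Wales</option>
      </select>
    </label></p>
    <p><input type="submit" value="Submit"></p>
  </form>
</body>
</html>
```

Each input's name attribute becomes the key under which the server-side script receives that value.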

The two methods of submitting a form are GET and POST. GET sends the data in the URL; POST sends the data in the body of the HTTP request. GET should be used when you are querying the server and POST should be used when you are updating the server (or the database server). Later we'll see other HTTP verbs.

To submit form data securely using POST (or GET), it suffices to specify an action with a URL of the form https://... This runs HTTP over the Secure Sockets Layer (SSL, now superseded by TLS), which transparently encrypts the HTTP request and response. No other change to the HTML or server-side script is required.
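
For instance, in the following fragment only the action URL differs from an unencrypted form. The host www.example.com and the script name save_details.php are placeholders, not real course resources.

```html
<!-- only the scheme in the action URL changes; the fields stay the same -->
<form method="post" action="https://www.example.com/save_details.php">
  <p><label>Name: <input type="text" name="name"></label></p>
  <p><input type="submit" value="Submit"></p>
</form>
```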

HTML5

HTML5 is under active development and much of it is implemented in modern browsers, all of whose developers have committed to implementing all of it.

A list of HTML5 differences from HTML4 may be a more convenient introduction to HTML5.

HTML5 improves on HTML 4.01 in the following areas:

All HTML documents submitted in this course must be valid HTML5 documents.

HTML authoring principles

XHTML Documents

XHTML documents are XML documents corresponding to HTML documents. During the early 2000s, the W3C strongly advocated the use of XHTML instead of HTML, but developers equally strongly resisted.

It now seems that the future of HTML is HTML5, not XHTML, and even the W3C concedes this.

Today, XML is used for representing data, not for representing Web pages.

Cascading Style Sheets (CSS)

For each element type, or for elements of a given class or id, it is possible to specify various presentation properties of the element. In particular, we can specify font families, weights, and styles, element (and background) colours, element indentations, element paddings, and element positions.

It's particularly useful to specify the presentation of <div>s (block-level elements) and <span>s (inline elements) using class and id attributes.

The collection of such specifications can go in an external style sheet, such as wp.css, and be included in multiple documents with a link element such as

<link rel="stylesheet" type="text/css" href="wp.css">

in the head of the document. This is the recommended way to specify styles within a document.

An alternative way is to use an embedded style sheet, also in the head of a document:

<style type="text/css">
/* level 1 headings are in bold font */
h1 { font-weight: bold; }
p  { font-family: verdana, sans-serif; color: blue; background-color: white; }
/* text in class="alert" elements has colour red and size large */
.alert { color: red; font-size: large; }
/* text in paragraphs inside elements with class="body" has style italic */
.body p { font-style: italic; }
/* text in the element with id="footer" has weight bold, style italic */
#footer { font-weight: bold; font-style: italic; }
/* tables have solid thin blue borders */
table { border: solid thin blue; }
</style>
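
To see what these selectors match, here is a hedged sketch of a document they could style. It assumes the rules above have been saved in wp.css; the text content is invented for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Selector targets</title>
  <!-- assumption: wp.css holds the rules listed above -->
  <link rel="stylesheet" href="wp.css">
</head>
<body>
  <h1>Matched by the h1 rule</h1>
  <div class="body">
    <p>Matched by both the p rule and the .body p rule.</p>
    <p>Only <span class="alert">this span</span> is matched by .alert.</p>
  </div>
  <div id="footer">Matched by #footer; an id must be unique within a page.</div>
</body>
</html>
```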

Embedded styles override any preceding (external) styles.

Finally, styles may be used inline:

<h1 style="color: red">Heading 1</h1>

Inline styles override any preceding (external or embedded) styles.

Where two rules with equally specific selectors conflict, the rule that occurs later in a style sheet overrides the one that occurred earlier in the sheet.

This overriding is why cascading style sheets are called "cascading".
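
The overriding order can be sketched in a single page. The file name base.css and its assumed contents (shown in the comment) are hypothetical; the colours are arbitrary.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Cascade order</title>
  <!-- assumption: base.css contains  h1 { color: blue; }  -->
  <link rel="stylesheet" href="base.css">
  <style>
    /* embedded rule comes after the external sheet, so it wins over blue */
    h1 { color: green; }
    /* two rules with the same selector: the later one wins */
    h2 { color: green; }
    h2 { color: purple; }
  </style>
</head>
<body>
  <h1>Green: the embedded rule overrides the external one</h1>
  <h1 style="color: red">Red: the inline style overrides both</h1>
  <h2>Purple: the later h2 rule overrides the earlier one</h2>
</body>
</html>
```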

Style sheets are capable of much more than this, including the specification of absolute and relative positions of elements on a page, allowing them to replace the use of tables for layout. In particular, the float property indicates that other elements may flow around the floated element.

For example, style sheets can be used to specify multi-column layouts, avoiding the use of frames (which can't be bookmarked) and tables (which are very inaccessible).

Here is a simple example of the same two-column layout as above using CSS floating and positioning. In this case, you must study the page source and the external CSS file to understand the example.

Here is a more complex example. Again, you need to study the external CSS file.

Note: Not all browsers implement CSS (or HTML, or JavaScript) fully or properly (IE8, for example), and unfortunately this requires extra work to ensure sites display properly on all browsers. This problem is due to the browser developers, not to the CSS standard. To keep this course simple, you may ignore IE8 and restrict yourself to recent, standards-compliant browser versions (e.g., Firefox, Opera, Safari and Chrome).

Good solutions to the two- or three-column layout problem should allow a footer that goes across the page, should work correctly whichever column is longest, and should work reasonably when font sizes are changed and when the page is resized. These are difficult goals to achieve.
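
One possible starting point, offered as a sketch rather than as the course's own solution, floats a navigation column to the left and clears the footer below it. The ids (header, nav, content, footer) and the 25% width are assumptions.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Two-column layout sketch</title>
  <style>
    /* left column: floated, so the content sits beside it */
    #nav     { float: left; width: 25%; }
    /* right column: the margin keeps it out of the nav column,
       even where the content runs longer than the nav */
    #content { margin-left: 25%; }
    /* footer: starts below the floated column, whichever column is longer */
    #footer  { clear: both; }
  </style>
</head>
<body>
  <div id="header"><h1>Site title</h1></div>
  <div id="nav">
    <ul>
      <li><a href="#">Home</a></li>
      <li><a href="#">Notes</a></li>
    </ul>
  </div>
  <div id="content">
    <p>Main content goes here.</p>
  </div>
  <div id="footer"><p>Footer text across the full page width.</p></div>
</body>
</html>
```

Percentage widths let the columns adapt when the page is resized; this sketch does not attempt equal-height columns.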

Information on how to specify such two- and three-column layouts using style sheets is given in Chapter 24 of Web Design in a Nutshell, 3rd edition, by Niederst Robbins and in the article CSS Positioning 101 by Noah Stokes on the excellent A List Apart Web site.

Many freely usable examples of CSS templates suitable for different applications are provided at Free CSS Templates.

The CSS Zen Garden contains many more examples of how different style sheets may render the same document in very different ways.

Laboratory exercises explore the use of CSS. We discuss CSS and positioning in particular in more detail later in the course.

SASS is an extension to CSS that simplifies the task of writing complex style sheets; we will use SASS throughout.

Twitter Bootstrap and Zurb Foundation are client-side frameworks that simplify the task of specifying presentation and behaviour for HTML pages. (We will not use these.)

HTTP Basics

HTTP is the protocol used for communicating between a browser (the client) and the Web server (the server). The browser sends a request; the server sends a response. Requests are primarily text messages; responses may be a mixture of text and binary data.

A typical HTTP request from a client has the form:

GET /index.html HTTP/1.1
Host: www.gu.edu.au
User-Agent: Mozilla/5.1 (Windows 2000;U) Opera 6.0 [en]
Accept: image/gif, image/jpeg, text/*, */*

Here the first line gives the method, the path of the requested resource and the protocol version. The Host line names the server (it is required in HTTP/1.1), the User-Agent line describes properties of the browser, and the Accept line specifies what MIME types the browser accepts. This header information ends with a blank line.

Possible methods include HEAD, GET, POST, PUT and DELETE. Not all servers implement PUT and DELETE. (See the discussion of RESTful Web Services below.)

If a form is being submitted using method GET, the URL will be extended with data from the form, for example:

show_details.php?name=Steve&age=25

(Here, we are using a relative URL, with a query part, as opposed to an absolute URL which would also include the protocol and domain name, and show_details.php is the PHP script that processes the form data.)
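
A form along the following lines would produce that URL when the user enters Steve and 25 and submits it. The field names are inferred from the query string; the labels are assumptions.

```html
<!-- with method="get" the browser appends ?name=...&age=... to the action URL -->
<form method="get" action="show_details.php">
  <p><label>Name: <input type="text" name="name"></label></p>
  <p><label>Age: <input type="text" name="age"></label></p>
  <!-- the submit button has no name, so it adds nothing to the query -->
  <p><input type="submit" value="Show details"></p>
</form>
```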

If a form is being submitted using method POST, the data will be transmitted in the body of the HTTP request, after the header. If the URL uses https, the whole request, header and body, is encrypted.

The response from the server starts with a header:

HTTP/1.1 200 OK
Date: Wed, 12 Mar 2003 01:30:17 GMT
Server: Apache/1.3.22 (Unix) mod_perl/1.26 PHP/4.2.0
Content-type: text/html
Content-length: 141

Here, 200 is the status code and the content type (text/html) indicates an HTML document. Other important content types indicate plain text, image files, audio files, and other multimedia data. This header information ends with a blank line and is followed by the requested data, in this case an HTML document.

Other possible 3-digit status codes that can be returned are:

The official definition of HTTP/1.1 was RFC 2616, which may be found at the RFC Editor site; it has since been superseded by later RFCs (currently RFC 9110 and RFC 9112).