Good Practice in Web Page Authoring

Contents

  1. Introduction
  2. Writing Style and Textual Content
  3. Valid HTML
  4. Structural Markup
  5. Style Sheets
  6. Images and Backgrounds
  7. Use of Links
  8. Browser Compatibility
  9. Usability and Accessibility

1. Introduction

This page isn't meant as anything like a complete guide to writing web pages. A basic knowledge of web and HTML concepts is assumed, so it doesn't even start from the beginning. It also doesn't include much on the aesthetic considerations of web design. It is instead intended to help relative beginners get to grips with some of the technical aspects of writing web pages, mainly by raising issues and providing pointers to other resources. Direct advice is given where it is felt to concern a particularly important and/or widespread issue. Some reasoning is usually provided, but full justifications are often left to the linked resources. If you don't know your attributes from your elements, you should perhaps read about some basic HTML concepts, or try this introduction to HTML first.

I always write my web pages using a simple text editor (e.g. Windows' Notepad). This is the only way you can guarantee having complete control over the HTML code you produce. The advice in this article was originally written with HTML 4.01 and CSS 1 in mind, though it is also applicable to XHTML and CSS 2. Browser support for the extra features (or rather user adoption of standards-compliant browsers) is growing. For more information about the different versions, you could read this history of HTML.

Three web sites that deserve special mention are those of the World Wide Web Consortium (W3C), the Web Design Group (WDG), and HTML Source. These sites contain loads of useful information for beginners and experts alike, and they have been frequently referenced below. For a longer and more detailed document, along similar lines to this page, you might also like to read a message to clueless website authors. There's also Constance Petersen's Web Commandments: Ten Deadly Sins and How to Overcome Them.

2. Writing Style and Textual Content

The web is very different from traditional print media, and authors need to be aware of this when writing content for presentation on the web. In particular, pages should be concise and scannable. Bullet points and meaningful sub-headings should be used liberally, so that readers can find what they need at a glance.

Also, unlike other media, there's no real reason to omit information on the grounds that it might get in the way or distract users from more important points. Web users expect to be able to find all the information they need right there on your site, and want to avoid having to phone, write to, or email somebody to get anything that's missing. You can certainly overload a given page, but with suitable use of headings, sections, and sub-pages it should be possible to arrange all the necessary information in a logical and easy-to-use manner.

Several good resources with further details are available from Jakob Nielsen's Writing for the Web pages. The short article How Users Read on the Web should be considered essential reading for all web content authors, as should Web Design from Scratch's Introduction to Writing Copy for the Web.

3. Valid HTML

The official HTML standards are produced by the World Wide Web Consortium (W3C). With the exception of Internet Explorer, most modern browser versions do a pretty good job of rendering web pages according to these standards. Unfortunately, IE still has a depressingly large market share. Nevertheless, you should still always use valid HTML when writing web pages. In particular, make sure that you nest tags properly (last opened, first closed), and be aware that there are restrictions on which elements may be placed inside others. Every HTML document should contain a DOCTYPE declaration (usually on the first line) identifying the HTML version being used. It is also important to specify the character encoding used in the document. The WDG has more information both on choosing a doctype and on using character encodings. A List Apart has an article with some further information about doctypes and browser modes.
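
As a rough illustration of the points above, a minimal HTML 4.01 Strict page might look something like the following sketch. The choice of UTF-8, the title, and the paragraph text are my own placeholders rather than requirements; use whichever doctype and encoding actually match your document.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
  <!-- Declare the character encoding early, before any non-ASCII text.
       UTF-8 is an assumption here; state the encoding the file is saved in. -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>Example page</title>
</head>
<body>
  <!-- Tags are nested properly: last opened, first closed -->
  <p>A paragraph with <em>emphasised <strong>and strong</strong></em> text.</p>
</body>
</html>
```

If your server sends a charset in its HTTP Content-Type header, that takes priority over the meta element, so it is worth checking that the two agree.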

It is good practice always to include optional end tags (e.g. </li> and </p>), and to ensure text is not written directly inside the body element, but is enclosed within another block-level element (e.g. <p>). Element and attribute names should always be written entirely in lowercase letters. These all become mandatory for XHTML, so apply them for forwards compatibility, if nothing else. Equally, all attribute values should be quoted (with either single or double quotes). Even in HTML this is sometimes necessary, the most common example being colour values containing a '#' - not that you should have colour values in your HTML, mind...
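
A short fragment following these conventions might look like this. The file names and the class name are made up for the example.

```html
<!-- Lowercase names, quoted attribute values, explicit end tags -->
<ul class="contacts">
  <li><a href="staff.html">Staff list</a></li>
  <li><a href="contact.html">Contact details</a></li>
</ul>

<!-- Text sits inside a block-level element, not directly in the body -->
<p>Please get in touch if anything is missing.</p>
```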

Even though some browsers (like Internet Explorer) will allow you to get away without doing so, chevrons (<, >) and ampersands (&) intended to appear literally on a web page must always be written with their respective character references: &lt;, &gt;, and &amp;. This includes ampersands in text attributes and URLs. The trailing semicolon isn't always required, but the rules are complex, so it's safest always to include it. The WDG has more information about character entities. Be aware that the (decimal) numeric references are usually better-supported than the named equivalents, especially for those names that have only been introduced into the specifications recently. However, if a browser does not support an entity, it usually displays the reference literally, in which case the names will degrade more intelligibly than the numbers.
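
To make this concrete, here is a small sketch showing the references in body text, in a URL, and in both named and numeric form. The script name search.cgi is hypothetical.

```html
<!-- Literal chevrons and ampersands in ordinary text -->
<p>Use &lt;em&gt; for emphasis, and write AT&amp;T rather than a bare ampersand.</p>

<!-- Ampersands inside URLs in attributes are escaped too -->
<p><a href="search.cgi?q=html&amp;page=2">Second page of results</a></p>

<!-- Named and decimal numeric references for the same character (e acute) -->
<p>caf&eacute; and caf&#233; should display identically.</p>
```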

The W3C Markup Pages contain lots of useful information and links, including full specifications for HTML 4.01, and XHTML 1.0. The specs may be a bit daunting to beginners, who might prefer something like the WDG's HTML Reference instead. The W3C's Markup Validation Service is an excellent way to check the validity of your pages.

4. Structural Markup

HTML is intended primarily to provide structural markup rather than visual formatting information. Indeed, many of the presentation elements and attributes, including the <font> tag, have been removed from the latest HTML standards. Conceptually, HTML should provide structural information identifying headings, paragraphs, quotes, lists, emphasised text, etc. Suggestions on how browsers should format these for display should mostly be confined to a separate document called a style sheet (see below).

Do not use the <font> element. To change the font for a complete block of text, you should apply a style sheet rule to the element delimiting the block. Authors should make use of the phrase elements, which are based on structural context rather than formatting style. This means using <em> and <strong> instead of <i> and <b> for emphasising text, and <code> in place of <tt> for computer code excerpts. Browsers should format text inside these elements in an appropriate way, though style sheets can be used to override the default behaviour. There are also a few other phrase elements, which are useful in technical contexts. See the WDG's phrase elements reference for more details.

When an appropriate structural element is available you should use it, even if it doesn't provide the visual formatting you want by default - this is left to the style sheet. For example, lists should always be marked up with <ul> or <ol>, quotations with <q> (inline) or <blockquote> (block level), and the phrase elements above used when appropriate.

Conversely, it is important not to use structural tags simply for their formatting effects. For example, <blockquote> should be used only for quotation blocks, and not simply to indent text. If you want to apply special formatting to something with no content-specific element, then use <div> (for blocks) or <span> (for in-line phrases) in conjunction with a style sheet.
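
The fragment below is one possible sketch of markup that describes structure and leaves appearance to the style sheet. The class names (warning, deadline) and the text are invented for illustration.

```html
<!-- Phrase elements describe meaning, not appearance -->
<p>You <strong>must</strong> save the file before running <code>make</code>.</p>

<!-- A list is marked up as a list -->
<ol>
  <li>Write the page.</li>
  <li>Validate it.</li>
</ol>

<!-- blockquote is used only for an actual quotation -->
<blockquote>
  <p>Quoted text from another source goes here.</p>
</blockquote>

<!-- No content-specific element fits, so use div and span with a class -->
<div class="warning">
  <p>Back up your files <span class="deadline">before Friday</span>.</p>
</div>
```

A style sheet can then give the warning block a border or indent without the markup having to pretend it is a quotation.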

5. Style Sheets

Cascading style sheets (CSS) give instructions to browsers on how they should handle the visual formatting of your web page. Not only are they the correct place for such style information (see above), they also make life easier for web designers. For instance, if you want all of your second-level headings to have dark blue text, then you can set that colour using a single style sheet rule, and avoid having to put <font color="#000080"> at the start of each one. Not only will this make your pages faster to load, but also if you ever decide to change the default colour, you only have to change it in one place.

The basic CSS rule-set consists of a selector followed by a declaration block containing zero or more (semicolon separated) declarations. Each declaration comprises a property name, followed by a colon and then its value. The initial selector identifies which HTML elements the rule-set will apply to. Like in HTML, extra white space is irrelevant. As an example, we might use the following to specify dark blue italic second-level headings:

h2 { color : #000080 ; font-style : italic }

Such rules are usually placed in an external style sheet, though they may also be placed within the <style> element inside the <head> of an HTML document. A third option is to include a list of declarations in the style attribute of individual HTML tags.
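
The three options might be sketched as follows. The file name site.css and the heading text are assumptions made for the example.

```html
<head>
  <title>Style sheet example</title>
  <!-- 1. External style sheet, shared by every page that links to it -->
  <link rel="stylesheet" type="text/css" href="site.css">

  <!-- 2. Rules embedded in this document only -->
  <style type="text/css">
    h2 { color: #000080; font-style: italic }
  </style>
</head>
<body>
  <!-- 3. Declarations applied to a single element -->
  <h2 style="color: #800000">A one-off dark red heading</h2>
</body>
```

The external sheet is usually the best default, since it keeps the "change it in one place" benefit across the whole site; the style attribute gives that up entirely.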

Further details (suitable for beginners) can be found in the WDG Style Sheets Pages, and the W3C Style Sheets Guide. The more advanced might be interested in css.maxdesign.com.au, which gives lots of useful examples and tutorials, especially for styling lists and floats. For the hard-core there are also the full specifications for CSS level 1 and CSS level 2. You can check the validity of any style sheets you write with the W3C's CSS Validation Service. Discussion of issues surrounding browser support for style sheets is postponed until the browser compatibility section.

6. Images and Backgrounds

A web page consisting of just plain text can be rather boring, so authors often feel the need to splash lots of images around. However, images can take a long time to download, and do not scale well for different resolution screens. Try to avoid the temptation to use background images; busy images make the text on top hard to read. Also avoid using images which are not strictly necessary, e.g. those just containing text such as 'buttons' for links, and headings.

All <img...> tags must include an alt attribute. The purpose of this is to provide text to replace the image in text-only and non-visual browsers. It will also allow users to find out a bit more about what they're waiting for, while it downloads. The value of alt should be alternative text, rather than a description of the image (the place for the latter would be the title attribute). Purely decorative images should have alt="", and will then be ignored by non-visual browsers. If important information is present in an image, make sure it is also accessible to those who can't see the image. If the values of alt and title aren't enough, then a URL can be provided in the longdesc attribute, as a link to a full description. Useful articles for further reading include A. Flavell's use of alt texts in imgs, J. Korpela's guidelines on alt texts in img elements, and Nomensa's This Isn't Just alt Text… .

It is advisable always to include the height and width attributes in each <img...> tag. This way the browser can leave the right amount of space and get on with laying out the rest of the page while the image is loading. If you fail to do this, the text can jump around all over the place as the images load. If you must use background images, make sure that the text on top is sufficiently contrasting, and that the background colour is set so that the text can still be read before the image has loaded.
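
Putting the advice in this section together, an illustrative sketch might look like this. All file names, dimensions, and the sales figures in the alt text are invented for the example.

```html
<!-- In the head: if a background image is used, also set a background
     colour so the text is readable before the image has loaded -->
<style type="text/css">
  body { color: #000000; background: #ffffff url(paper.png) }
</style>

<!-- Informative image: alt replaces it, title describes it,
     longdesc links to a full description -->
<p>
  <img src="sales-chart.png" width="400" height="300"
       alt="Sales rose from 100 units in January to 250 in June."
       title="Bar chart of monthly sales"
       longdesc="sales-chart.html">
</p>

<!-- Purely decorative image: empty alt so non-visual browsers skip it -->
<p><img src="divider.png" width="400" height="10" alt=""></p>
```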

7. Use of Links

Links typically take users to new pages without confirmation. Try to make sure your users have a good idea what they will get by clicking on a link. Is it a different section of the current page, another page within your site, or on a different site altogether? Think very carefully before getting pages to open in a new window. A user who knows roughly where the link is going should be in a position to decide for themselves whether they want a new window or not, and a decent browser should allow them to implement this choice.

Link phrases (i.e. the text between <a...> and </a>) need to be chosen carefully. They should generally make sense when read out of context (so don't use 'click here'), and should provide as much information as possible about what's at the other end. To include further information without disrupting the flow of the text, you can make use of the title attribute of the <a> element. The content of the title attribute is commonly displayed as a tool tip in visual browsers. To maintain consistency, a link consisting solely of someone's name should be to their personal home page or biography, rather than an email link. Such mailto: links should be made obvious, so that the user realises that clicking one might start up some email software.
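
As a sketch of these points, the following shows a descriptive link phrase with a title, and an email link that looks like one. The email address is a placeholder.

```html
<!-- The link phrase makes sense out of context; the title adds detail -->
<p>
  See the
  <a href="http://www.w3.org/TR/html401/"
     title="Full text of the HTML 4.01 Recommendation on the W3C site">HTML 4.01 specification</a>
  for details.
</p>

<!-- An email link is made obvious by showing the address itself -->
<p>Email: <a href="mailto:webmaster@example.org">webmaster@example.org</a></p>
```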

It should be obvious which things on a page are links, and which are not. The standard formatting for links is blue underlined text, and your users will expect this. Be careful about changing things too far away from this standard, or using similar formatting for things that aren't links. In particular, I would recommend never using underlining for non-links; there are plenty of other ways to provide emphasis.

More information about good practice in links can be found in J. Nielsen's using link titles... 'Alertbox' article, and the navigation mechanisms section of the Web Content Accessibility Guidelines. I also have a companion essay Website Navigation Mechanisms, which goes into more detail.

8. Browser Compatibility

Fortunately, the days of Netscape 4 and its buggy behaviour with CSS are long gone, and Internet Explorer 5 is pretty much on the way out too. Annoyingly though, the most popular browser, Internet Explorer 6, still fails to provide anything like full support for the current (X)HTML and CSS standards. Thankfully however, the growing use of alternatives such as Mozilla Firefox, Opera, and Apple's Safari has forced a re-think by Microsoft, and hopefully things will improve with a combination of better standards support in IE7 and further uptake of alternatives.

By coding to the standards you should be able to get the desired effects straight away in Firefox, Safari, and Opera, and you can then worry about applying any tweaks necessary to get acceptable behaviour in IE. A website should at least be readable and usable in any browser, even if it isn't pixel perfect. If users want everything to look spick and span they can always upgrade to a browser with better standards support. When writing a site, do check how it looks in as many different browsers (including older versions) as you can. There are still bugs lurking even in the most recent versions of the standards-compliant alternatives to IE, as the Acid2 Test demonstrates. Also think about the screen resolutions your audience might be using; is your site still usable with a resolution of 800 by 600?

A handy trick for applying content or CSS only to Internet Explorer is to use IE Conditional Comments. These allow HTML code to be inserted in what standards-compliant browsers will interpret (correctly) as a comment, but still have that code processed by Internet Explorer. For example, to include an IE-specific style sheet, you would write:

<!--[if IE]>
<link rel="stylesheet" type="text/css" href="ie.css" />
<![endif]-->

The site QuirksMode contains handy discussions of the various incompatibilities of popular browsers, and how to get round them.

9. Usability and Accessibility

Websites are meant to be used, so design them with this in mind. Give your site logical structure, and use a consistent layout and design for all the pages. Provide an obvious and consistent link from each page back to your homepage. Also identify the hierarchy of sections to which the page belongs. This will help users see their current position within your site, especially for those who haven't entered your site via the homepage. For larger sites, it is a good idea to include a site-map, index, and/or search facility. (For further details see the companion essay Website Navigation Mechanisms.)

Admittedly, there will be very few people with serious special browsing needs (e.g. visual impairment, mobility problems). However, much of making sites accessible is using proper structural markup, and designing things to be clear; both of which are good practice and should benefit all users. Relevant comments on images and links have also been made above. Note that colour-blindness affects nearly 10% of men, so it should certainly be considered when choosing images and colour schemes. Other areas for attention include providing summary information and full structural markup for tables, and providing language information for pages and elements.
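
For the last two points, a minimal sketch using HTML 4.01's lang and summary attributes might look like this. The table contents are made up for the example.

```html
<!-- The page language goes on the html element, e.g. <html lang="en">.
     A phrase in another language is marked individually: -->
<p>The French phrase <span lang="fr">mise en page</span> means page layout.</p>

<!-- summary describes the table for non-visual browsers;
     th and scope identify which cells are headings -->
<table summary="Opening hours for each day, one row per day">
  <caption>Opening hours</caption>
  <thead>
    <tr><th scope="col">Day</th><th scope="col">Hours</th></tr>
  </thead>
  <tbody>
    <tr><th scope="row">Saturday</th><td>10:00 to 16:00</td></tr>
    <tr><th scope="row">Sunday</th><td>Closed</td></tr>
  </tbody>
</table>
```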

Avoid the temptation to redefine the basic font sizes for documents. Users will (or at least should) have set their browsers to have a default font size they're happy with, and won't like sites which change it. Do not try to prevent browsers from resizing fonts (e.g. by using images to display text, or by defining font sizes in points or pixels). Never use the <blink> or <marquee> elements. They make text hard to read, and are not part of the HTML standards. Also, avoid using block capitals for emphasis. Perhaps they might be acceptable as a styled heading, but uppercase letters are harder to read, and will disrupt the flow of text if used in the middle of paragraphs.
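
One way to follow this advice is to size text relative to the user's default, for instance with percentages or ems. The particular values and the footnote class below are arbitrary choices for illustration.

```html
<style type="text/css">
  /* Leave the base size alone so the user's chosen default applies */
  body { font-size: 100% }

  /* Scale other text relative to that default, not in pt or px */
  h1 { font-size: 1.8em }
  .footnote { font-size: 0.85em }
</style>
```

Because every size here is relative, changing the default font size in the browser scales the whole page in proportion.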

Jakob Nielsen writes articles on web usability. The W3C Web Accessibility Initiative (WAI) has produced a set of Web Content Accessibility Guidelines, or for a more readable tutorial you could try Dive into Accessibility. The VisCheck site discusses colour-blindness and provides utilities for simulating its effects, and improving images for colour-blind people.