This page isn't meant as anything like a complete guide to writing web pages. A basic knowledge of web and HTML concepts is assumed, so it doesn't even start from the beginning. It also doesn't include much on the aesthetic considerations of web design. It is instead intended to help relative beginners get to grips with some of the technical aspects of writing web pages, mainly by raising issues and providing pointers to other resources. Direct advice is given where it is felt to concern a particularly important and/or widespread issue. Some reasoning is usually provided, but full justifications are often left to the linked resources. If you don't know your attributes from your elements, you should perhaps read about some basic HTML concepts, or try this introduction to HTML first.
I always write my web pages using a simple text editor (e.g. Windows' Notepad). This is the only way you can guarantee having complete control over the HTML code you produce. The advice in this article was originally written with HTML 4.01 and CSS 1 in mind, though it is also applicable to XHTML and CSS 2. Browser support for the extra features (or rather user adoption of standards-compliant browsers) is growing. For more information about the different versions, you could read this history of HTML.
Three web sites that deserve special mention are those of the World Wide Web Consortium (W3C), the Web Design Group (WDG), and HTML Source. These sites contain loads of useful information for beginners and experts alike, and they have been frequently referenced below. For a longer and more detailed document, along similar lines to this page, you might also like to read a message to clueless website authors. There's also Constance Petersen's Web Commandments: Ten Deadly Sins and How to Overcome Them.
The web is very different from traditional print media, and authors need to be aware of this when writing content for presentation on the web. In particular, pages should be concise and scannable. Bullet points and meaningful sub-headings should be put to good use to make pages easier for users to scan.
Also, unlike other media, there's no real reason to omit information on the grounds that it might get in the way or distract users from more important points. Web users expect to be able to find all the information they need right there on your site, and want to avoid having to phone, write to, or email somebody to get anything that's missing. You can certainly overload a given page, but with suitable use of headings, sections, and sub-pages it should be possible to arrange all the necessary information in a logical and easy-to-use manner.
Jakob Nielsen's Writing for the Web pages offer several good resources with further details. The short article How Users Read on the Web should be considered essential reading for all web content authors, as should Web Design from Scratch's Writing for the Web article.
The official HTML standards are produced by the World Wide Web Consortium (W3C). With
the exception of Internet Explorer, most modern browser versions do
a pretty good job of rendering web pages according to these
standards. Unfortunately, IE still has a depressingly large
market share. Nevertheless, you should still always use valid HTML when
writing web pages. In particular, make sure that you nest tags
properly (last opened, first closed), and be aware that there are
restrictions on which elements may be placed inside others. Every HTML
document should contain a DOCTYPE declaration (usually on
the first line) identifying the HTML version being used. It is also
important to specify the character encoding used in the document. The
WDG has more information both on
choosing a
doctype and on using
character encodings. A List
Apart has an article with some further information about doctypes and
browser modes.
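As a rough sketch of the points above, a minimal valid page might look like the following. It assumes the HTML 4.01 Strict doctype and a UTF-8 encoding; both are illustrative choices, and the title and text are placeholders.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
  <!-- Declares the character encoding; utf-8 is an assumed choice -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>Example page</title>
</head>
<body>
  <!-- Elements are closed in the reverse order to that in which they were opened -->
  <p>Some <em>properly <strong>nested</strong></em> text.</p>
</body>
</html>
```

The encoding named in the document should match the one your editor actually uses when saving the file.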
It is good practice always to include optional end tags
(e.g. </li> and </p>), and to
ensure text is not written directly inside the body element, but is
enclosed within another block-level element
(e.g. <p>). Element and attribute names should
always be written entirely in lowercase letters. These all become
mandatory for XHTML, so apply them for forwards compatibility, if
nothing else. Equally, all attribute values should be quoted (with
either single or double quotes). Even in HTML this is sometimes
necessary, the most common example being colour values containing a
'#' (not that you should have colour values in your HTML, mind).
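To illustrate, here is a hypothetical list fragment written twice: once in a style that HTML 4.01 tolerates, and once following the recommendations above. The file name page.html is made up for the example.

```html
<!-- Tolerated by HTML 4.01, but discouraged -->
<UL>
  <LI>First item
  <LI><A HREF=page.html>Second item</A>
</UL>

<!-- Preferred: lowercase names, explicit end tags, quoted attribute values -->
<ul>
  <li>First item</li>
  <li><a href="page.html">Second item</a></li>
</ul>
```

Only the second form carries over to XHTML without changes.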
Even though some browsers (like Internet Explorer) will allow you
to get away without doing so, chevrons (<, >) and ampersands
(&) intended to appear literally on a web page must always be
written with their respective character references:
&amp;lt;, &amp;gt;, and
&amp;amp;. This includes ampersands in text attributes
and URLs. The trailing semicolon isn't always required, but the rules
are complex, so it's safest always to include it. The WDG has more information
about character entities. Be aware that the (decimal) numeric
references are usually better-supported than the named equivalents,
especially for those names that have only been introduced into the
specifications recently. However, if the entity is not supported by a
browser, it is usually displayed literally, in which case the names
will degrade more intelligibly than the numbers.
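A short sketch of character references in use follows. The URL and query parameters are hypothetical; the pound sign is shown both as a decimal numeric reference and as a named one.

```html
<!-- Literal chevrons and ampersands in text -->
<p>Use &lt;em&gt; for emphasis, and write AT&amp;T with an escaped ampersand.</p>

<!-- Ampersands inside URLs must be escaped too -->
<p><a href="search.cgi?q=html&amp;lang=en">Search results</a></p>

<!-- The same character as a numeric and as a named reference -->
<p>Price: &#163;10 (numeric) or &pound;10 (named)</p>
```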
The W3C Markup Pages contain lots of useful information and links, including full specifications for HTML 4.01, and XHTML 1.0. The specs may be a bit daunting to beginners, who might prefer something like the WDG's HTML Reference instead. The W3C's Markup Validation Service is an excellent way to check the validity of your pages.
HTML is intended primarily to provide structural markup rather than
visual formatting information. Indeed many of the presentation
elements and attributes, including the <font> tag,
have been removed from the latest HTML standards. Conceptually, HTML
should provide structural information identifying headings,
paragraphs, quotes, lists, emphasised text, etc. Suggestions on how
browsers should format these for display should mostly be confined to
a separate document called a style sheet (see below).
Do not use the <font> element. To change the
font for a complete block of text, you should apply a style sheet rule
to the element delimiting the block. Authors should make use of the
phrase elements, which are based on structural context rather than
formatting style. This means using <em> and
<strong> instead of <i> and
<b> for emphasising text, and
<code> in place of <tt> for
computer code excerpts. Browsers should format text inside these
elements in an appropriate way, though style sheets can be used to
override the default behaviour. There are also a few other phrase
elements, which are useful in technical contexts. See the WDG's phrase
elements reference for more details.
When an appropriate structural element is available you should use
it, even if it doesn't provide the visual formatting you want by
default - this is left to the style sheet. For example, lists should
always be marked up with <ul> or
<ol>, quotations with <q>
(inline) or <blockquote> (block level), and the
phrase elements above used when appropriate.
Conversely, it is important not to use structural tags simply for
their formatting effects. For example, <blockquote>
should be used only for quotation blocks, and not simply to indent
text. If you want to apply special formatting to something with no
content-specific element, then use <div> (for
blocks) or <span> (for in-line phrases) in
conjunction with a style sheet.
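The fragment below sketches structural markup along these lines. The class names note and product are invented for illustration, and would need matching rules in a style sheet to have any visual effect.

```html
<!-- Phrase elements describe meaning, not appearance -->
<p>You <em>must</em> save the file before running <code>make install</code>.</p>

<!-- blockquote is used for a real quotation, not for indentation -->
<blockquote>
  <p>A quotation long enough to stand as its own block.</p>
</blockquote>

<!-- div and span carry no meaning; they only give the style sheet something to select -->
<div class="note">
  <p>Text with a <span class="product">specially formatted phrase</span> inside it.</p>
</div>
```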
Cascading style sheets (CSS) give instructions to browsers on how
they should handle the visual formatting of your web page. Not only
are they the correct place for such style information (see above),
they also make life easier for web designers. For instance, if you
want all of your second-level headings to have dark blue text, then
you can set the default colour to be blue using a single style sheet
command, and avoid having to put <font
color="#000080"> at the start of each one. Not only will
this make your pages faster to load, but also if you ever decide to
change the default colour, you only have to change it in one
place.
The basic CSS rule-set consists of a selector followed by a declaration block containing zero or more (semicolon-separated) declarations. Each declaration comprises a property name, followed by a colon and then its value. The initial selector identifies which HTML elements the rule-set will apply to. As in HTML, extra white space is irrelevant. As an example, we might use the following to specify dark blue italic second-level headings:
h2 { color : #000080 ; font-style : italic }
Such rules are usually placed in an external style sheet, though
may also be placed within the <style> element
inside the <head> of an HTML document. A third
option is to include a list of declarations in the style
attribute of individual HTML tags.
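The three options might be sketched as follows, using the same dark blue italic heading rule. The file name site.css is hypothetical.

```html
<head>
  <title>Style sheet example</title>
  <!-- 1. External style sheet, shared by every page that links to it -->
  <link rel="stylesheet" type="text/css" href="site.css">
  <!-- 2. Rules embedded in the head of this document only -->
  <style type="text/css">
    h2 { color: #000080; font-style: italic }
  </style>
</head>
<body>
  <!-- 3. Declarations applied to a single element -->
  <h2 style="color: #000080; font-style: italic">A dark blue italic heading</h2>
</body>
```

In practice you would pick one of these for a given rule. The external sheet is the option that lets a single change affect the whole site.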
Further details (suitable for beginners) can be found in the WDG Style Sheets Pages, and the W3C Style Sheets Guide. The more advanced might be interested in css.maxdesign.com.au, which gives lots of useful examples and tutorials, especially for styling lists and floats. For the hard-core there are also the full specifications for CSS level 1 and CSS level 2. You can check the validity of any style sheets you write with the W3C's CSS Validation Service. Discussion of issues surrounding browser support for style sheets is postponed until the browser compatibility section.
A web page consisting of just plain text can be rather boring, so authors often feel the need to splash lots of images around. However, images can take a long time to download, and do not scale well for different resolution screens. Try to avoid the temptation to use background images; busy images make the text on top hard to read. Also avoid using images which are not strictly necessary, e.g. those just containing text such as 'buttons' for links, and headings.
All <img...> tags must include an
alt attribute. The purpose of this is to provide text to
replace the image in text-only and non-visual browsers. It will also
allow users to find out a bit more about what they're waiting for,
while it downloads. The value of alt should be
alternative text, rather than a description of the image (the
place for the latter would be the title
attribute). Purely decorative images should have alt="",
and will then be ignored by non-visual browsers. If important
information is present in an image, make sure it is also accessible to
those who can't see the image. If the values of alt and
title aren't enough, then a URL can be provided in the
longdesc attribute, as a link to a full
description. Useful articles for further reading include
A. Flavell's use
of alt texts in imgs, J. Korpela's
guidelines on
alt texts in img elements, and Nomensa's
This
Isn't Just alt Text….
It is advisable always to include the height and
width attributes in each <img...>
tag. This way the browser can leave the right amount of space and get
on with laying out the rest of the page while the image is loading. If
you fail to do this, the text can jump around all over the place as
the images load. If you must use background images, make sure that the
text on top is sufficiently contrasting, and that the background
colour is set so that the text can still be read before the image has
loaded.
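A sketch of these image attributes in use is given below. All file names, dimensions, and text values are made up for the example.

```html
<!-- alt replaces the image: here, the company name shown in the logo -->
<img src="logo.png" alt="Example Widgets Ltd" width="200" height="80">

<!-- Purely decorative image: empty alt text, so non-visual browsers skip it -->
<img src="divider.png" alt="" width="400" height="4">

<!-- alt gives the message, title describes the image, longdesc links to full details -->
<img src="sales-chart.png" alt="Sales rose steadily through the year"
     title="Bar chart of quarterly sales" width="320" height="240"
     longdesc="sales-chart.html">
```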
Links typically take users to new pages without confirmation. Try to make sure your users have a good idea what they will get by clicking on a link. Is it a different section of the current page, another page within your site, or on a different site altogether? Think very carefully before getting pages to open in a new window. A user who knows roughly where the link is going should be in a position to decide for themselves whether they want a new window or not, and a decent browser should allow them to implement this choice.
Link phrases (i.e. the text between <a...> and
</a>) need to be chosen carefully. They should
generally make sense when read out of context (so don't use 'click
here'), and should provide as much information as possible about
what's at the other end. To include further information without
disrupting the flow of the text, you can make use of the
title attribute of the <a>
element. The content of the title attribute is commonly
displayed as a tool tip in visual browsers. To maintain consistency,
a link consisting solely of someone's name should be to their personal
home page or biography, rather than an email link. Such
mailto: links should be made obvious, so that the user
realises that clicking it might start up some email software.
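As an illustration, the links below use descriptive link phrases, a title attribute for extra detail, and a mailto: link whose text makes its purpose plain. The page names and address are placeholders.

```html
<!-- The link phrase makes sense on its own; title adds detail without cluttering the text -->
<p>Read the <a href="accessibility.html"
   title="Accessibility statement, including keyboard shortcuts">accessibility
   statement</a> for details.</p>

<!-- A name links to a home page; the email link is shown as an address -->
<p><a href="people/jsmith.html">Jane Smith</a>
   (email: <a href="mailto:jsmith@example.com">jsmith@example.com</a>)</p>
```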
It should be obvious which things on a page are links, and which are not. The standard formatting for links is blue underlined text, and your users will expect this. Be careful about changing things too far away from this standard, or using similar formatting for things that aren't links. In particular, I would recommend never using underlining for non-links; there are plenty of other ways to provide emphasis.
More information about good practice in links can be found in J. Nielsen's using link titles... 'Alertbox' article, and the navigation mechanisms section of the Web Content Accessibility Guidelines. I also have a companion essay Website Navigation Mechanisms, which goes into more detail.
Fortunately, the days of Netscape 4 and its buggy behaviour with CSS are long gone, and Internet Explorer 5 is pretty much on the way out too. Annoyingly though, the most popular browser, Internet Explorer 6, still fails to provide anything like full support for the current (X)HTML and CSS standards. Thankfully however, the growing use of alternatives such as Mozilla Firefox, Opera, and Apple's Safari has forced a re-think by Microsoft, and hopefully things will improve with a combination of better standards support in IE7 and further uptake of alternatives.
By coding to the standards you should be able to get the desired effects straight away in Firefox, Safari, and Opera, and you can then worry about applying any tweaks necessary to get acceptable behaviour in IE. A website should at least be readable and usable in any browser, even if it isn't pixel perfect. If users want everything to look spick and span they can always upgrade to a browser with better standards support. When writing a site, do check how it looks in as many different browsers (including older versions) as you can. There are still bugs lurking even in the most recent versions of the standards-compliant alternatives to IE, as the Acid2 Test demonstrates. Also think about the screen resolutions your audience might be using; is your site still usable with a resolution of 800 by 600?
A handy trick for applying content or CSS only to Internet Explorer is to use IE conditionals. These allow HTML code to be inserted in what standards-compliant browsers will interpret (correctly) as a comment, but still have that code processed by Internet Explorer. For example, to include an IE-specific style-sheet, you would write:
<!--[if IE]>
<link rel="stylesheet" type="text/css" href="ie.css" />
<![endif]-->
The site QuirksMode contains handy discussions of the various incompatibilities of popular browsers, and how to get round them.
Websites are meant to be used, so design them with this in mind. Give your site logical structure, and use a consistent layout and design for all the pages. Provide an obvious and consistent link from each page back to your homepage. Also identify the hierarchy of sections to which the page belongs. This will help users see their current position within your site, especially for those who haven't entered your site via the homepage. For larger sites, it is a good idea to include a site-map, index, and/or search facility. (For further details see the companion essay Website Navigation Mechanisms.)
Admittedly, there will be very few people with serious special browsing needs (e.g. visual impairment, mobility problems). However, much of making sites accessible is using proper structural markup, and designing things to be clear; both of which are good practice and should benefit all users. Relevant comments on images and links have also been made above. Note that colour-blindness affects nearly 10% of men, so it should certainly be considered when choosing images and colour schemes. Other areas for attention include providing summary information and full structural markup for tables, and providing language information for pages and elements.
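A small sketch of the table and language markup mentioned above follows. The table contents are invented, and the attributes shown (summary, scope, lang) are those defined in HTML 4.01.

```html
<!-- summary describes the table's purpose and layout for non-visual browsers -->
<table summary="Opening hours, with one row per day of the week.">
  <caption>Opening hours</caption>
  <thead>
    <tr><th scope="col">Day</th><th scope="col">Opens</th><th scope="col">Closes</th></tr>
  </thead>
  <tbody>
    <tr><th scope="row">Monday</th><td>09:00</td><td>17:00</td></tr>
    <tr><th scope="row">Saturday</th><td>10:00</td><td>16:00</td></tr>
  </tbody>
</table>

<!-- lang marks a change of language within the page -->
<p>The phrase <span lang="fr">je ne sais quoi</span> is marked as French.</p>
```

The language of the page as a whole would be given by a lang attribute on the html element.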
Avoid the temptation to redefine the basic font sizes for
documents. Users will (or at least should) have set their browsers to
a default font size they're happy with, and won't like sites which
change it. Do not try to prevent browsers from resizing fonts (e.g. by
using images to display text, or by defining font sizes in points or
pixels). Never use the <blink> or
<marquee> elements. They make text hard to read,
and are not part of the HTML standards. Also, avoid using block
capitals for emphasis. Perhaps they might be acceptable as a styled heading,
but uppercase letters are harder to read, and will disrupt the flow of
text if used in the middle of paragraphs.
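One way to follow this advice is sketched below, using relative units so that every size scales from the user's chosen default. The class name small-print and the particular percentages are arbitrary choices for the example.

```html
<style type="text/css">
  /* Leave the user's default size alone for body text */
  body { font-size: 100% }
  /* Size everything else relative to that default */
  h1 { font-size: 150% }
  .small-print { font-size: 0.85em }
  /* Capitals applied as a heading style, not typed into the text */
  h1 { text-transform: uppercase }
</style>
```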
Jakob Nielsen has a whole site – useit.com – dedicated to usability issues, including a regular e-column. The W3C Web Accessibility Initiative (WAI) has produced a set of Web Content Accessibility Guidelines, or for a more readable tutorial you could try Dive into Accessibility. There is also an online utility called WebXACT (the successor to Bobby), which allows authors to check for potential accessibility problems with their pages. The VisCheck site discusses colour-blindness and provides utilities for simulating its effects, and improving images for colour-blind people. Finally, the Accessify Forum provides useful information, and an opportunity for discussion of web accessibility issues.