I’m a big believer in form following function: understand your content, understand your users’ needs, understand your technological options, and form will naturally follow. But where search is concerned, form strongly drives searchability as function.
Until very recently, search engines have had to make a Heisenbergian choice. Heisenberg, you’ll recall, observed that you can measure a quantum particle’s position or its momentum precisely, but not both at once: the more exactly you pin down one, the less you can know about the other. Position and momentum in quantum mechanics are already described in terms of probability, and this trade-off adds another level of uncertainty on top of that.
In the same way, search engines could analyze the web to a high degree of precision, knowing a few pages intimately, or they could analyze on a large scale, knowing many pages on a limited basis. But Google has reached the point where it says both can be attained: knowing many pages, and knowing them well.
As a partial step toward this end, Google and other search engines have agreed on an XML protocol for sitemaps, and Google has begun offering internal domain search directly from its results for larger sites. Understanding the structure of a site provides a better context for guiding searchers to their goal.
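To make the sitemap idea concrete, here is a minimal sketch that builds a sitemap file with Python’s standard library. The namespace is the one published for the Sitemaps protocol; the function name `build_sitemap` and the example URLs are my own placeholders, and a real sitemap may carry optional fields (such as last-modified dates) that this sketch leaves out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps XML protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string listing each URL in a <url><loc> entry.

    build_sitemap is a hypothetical helper for illustration only.
    """
    # Serialize the sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc")
        loc.text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Placeholder URLs; substitute the pages of your own site.
    print(build_sitemap([
        "https://example.com/",
        "https://example.com/womens-clothing/blouses-and-tops",
    ]))
```

The point is only that a sitemap is a flat, machine-readable list of the pages you want a search engine to know about; most content management systems can generate one for you.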
As search engines become more precise, understanding the semantics of individual pages is likewise becoming more relevant. For those unfamiliar with the field, semantics is the study of linguistic meaning: not only the meaning of a word or phrase in and of itself, but also the meaning found in the relationships between words, phrases, and sentences. Online semantics is the study (and the practice) of the meaning imposed on a page’s content by its metadata and structure.
Most web developers who think of semantics (if they think of it at all) think of metadata, primarily the meta tags found in the head portion of a page. Early in the internet’s development, these tags were viewed as highly important for searchability, but abuse of keywords diminished their importance to search engine algorithms, and therefore their importance to developers. The other elements of semantics, particularly in the body section of the page, were frequently misunderstood or ignored altogether. For example, H tags are frequently used out of order merely for their graphic impact, and CSS makes it easy to restyle and repurpose tags intended for completely different uses. (The dynamic menus on alexfiles.com, for example, are based on CSS enhancement of definition list tags, which are semantic tags in their own right. Don’t worry, I’m following my own advice and changing them soon!)
So here are updated semantic recommendations, gleaned not only from a decade of SEO experience, but also from working sessions with several search consultants (including Google, in 2006 and 2007) over the past year and a half.
If you want to test your semantics, the W3C offers a semantics validator. (Try it! Copy the URL for this page and insert that into the validator form.)
Head section optimization
Browser titles
Browser titles are strong indicators of the content of a page. To optimize their impact, browser titles should:
- Be individual to each page as much as possible (this may be difficult with catalogs and the like).
- Go in reverse breadcrumb order, from specific to broad. For example, a retail catalog page might have the following browser title:
Blouses and Tops | Women’s Clothing | Store name
- Be different from the H1 tag in the body section.
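The reverse breadcrumb order above can be sketched in a few lines of Python. This assumes your site can supply the breadcrumb trail as a list running from broad to specific; the function name `browser_title` and the " | " separator are illustrative choices, not a standard.

```python
def browser_title(breadcrumb, separator=" | "):
    """Join a broad-to-specific breadcrumb trail in reverse order.

    browser_title is a hypothetical helper; the breadcrumb is assumed
    to be a list of strings such as ["Store name", "Women's Clothing"].
    """
    # Drop empty entries so stray separators never appear in the title.
    parts = [part.strip() for part in breadcrumb if part.strip()]
    return separator.join(reversed(parts))


if __name__ == "__main__":
    trail = ["Store name", "Women’s Clothing", "Blouses and Tops"]
    print(browser_title(trail))
    # prints: Blouses and Tops | Women’s Clothing | Store name
```

Because the title is derived from the page’s position in the site, each page gets its own title without anyone having to write one by hand.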
Metadata
Make your meta tags (keywords, description, etc.) as specific as possible. Make them individual to the pages. Do not make them overly long; a brief sentence or two should suffice for your description.
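One way to check that descriptions really are individual to each page is to look for repeats across the site. The sketch below uses Python’s built-in HTML parser; the names `MetaDescriptionParser` and `shared_descriptions` are my own, and it assumes you already have each page’s HTML in hand, keyed by URL.

```python
from collections import defaultdict
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Capture the first <meta name="description"> content on a page."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "description":
            self.description = (attributes.get("content") or "").strip()


def shared_descriptions(pages):
    """Map each description used on more than one page to those pages.

    pages is assumed to be a dict of {url: html_text}.
    """
    urls_by_description = defaultdict(list)
    for url, html in pages.items():
        parser = MetaDescriptionParser()
        parser.feed(html)
        parser.close()
        if parser.description:
            urls_by_description[parser.description].append(url)
    return {
        description: urls
        for description, urls in urls_by_description.items()
        if len(urls) > 1
    }
```

An empty result means every page that has a description has its own; anything else is a list of pages worth rewriting.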
A note: try to avoid the meta refresh tag. It has been so abused by unscrupulous designers that search engines tend to dismiss it. Instead of using a meta refresh for a permanently changed URL, use a server-side 301 redirect. Use a 302 for temporary changes.
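The choice between the two redirects comes down to a status code and a Location header. Here is a minimal sketch of that decision; `redirect_response` is a hypothetical helper, and how you actually send the response depends on your server or framework.

```python
from http import HTTPStatus


def redirect_response(target_url, permanent):
    """Return (status_code, headers) for a server-side redirect.

    Use permanent=True for a URL that has moved for good (301) and
    permanent=False for a temporary change (302).
    """
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    return int(status), {"Location": target_url}


if __name__ == "__main__":
    # Placeholder URLs for illustration.
    print(redirect_response("https://example.com/new-page", permanent=True))
    print(redirect_response("https://example.com/holiday-sale", permanent=False))
```

Most web servers let you configure the same thing without writing code; the sketch only shows what ends up on the wire.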
Title tags
Your title tag should be unique to your page, reflecting its subject matter.
Body section optimization
H1 tags
Every page should have a single H1 tag. This defines the subject of the page, and should be specific to the page.
The H1 tag is the page title, but should be different from the browser title. The browser title serves as a reverse breadcrumb, putting the page content within the larger context of the site; the H1 tag corresponds to the smallest, most specific part of that trail, although the wording does not need to match exactly. It tells readers whether they’re in the right place.
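A quick way to spot pages where the H1 simply repeats the browser title is to pull both out of the HTML and compare them. This is a rough sketch using Python’s built-in parser; the names `TitleAndH1Parser` and `h1_repeats_title` are mine, and the comparison is a plain case-insensitive match, which is an assumption about what counts as "the same".

```python
from html.parser import HTMLParser


class TitleAndH1Parser(HTMLParser):
    """Collect the text of the <title> tag and of every <h1> tag."""

    def __init__(self):
        super().__init__()
        self._current = None  # "title", "h1", or None
        self.title = ""
        self.h1_texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._current = "title"
        elif tag == "h1":
            self._current = "h1"
            self.h1_texts.append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1_texts[-1] += data


def h1_repeats_title(html):
    """Return True if any H1 on the page matches the browser title."""
    parser = TitleAndH1Parser()
    parser.feed(html)
    parser.close()
    title = parser.title.strip().lower()
    return bool(title) and any(
        text.strip().lower() == title for text in parser.h1_texts
    )
```

A page titled "Blouses and Tops | Women’s Clothing | Store name" with an H1 of "Blouses and Tops" passes; a page where both say "Store name" does not.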
H tags in general
If your page requires structure beyond the top level, then turn to H tags. The H1 through H6 tags are the levels of your outline, and their inherent sizes give clues to readers as they move from broad to detailed information.
Many developers and sites use H tags as graphic tools, not outlines. Sites can be found with no H1 tag; with skips between the steps in tags (going from H1 to H3, for example); or with multiple H1 tags. With CSS and dynamic sites, H1 tags in a variety of appearances can be pulled into a page.
Even among semantics-aware developers, I’ve encountered a few misconceptions. To set them straight:
- It is not necessary to go beneath H1 for searchability. Just make sure that if you use lower-level H tags, they are used properly.
- Just as the H1 tag should not be the site name (don’t make all your H1 tags the store name, for example), the H1 tag also should not enclose a graphic. It should be readable text.
- It is not necessary to use all the H tags if you create an outline.
- It is not necessary to use the same number of tags at each level. You can have two H2 tags, one with two H3 tags beneath it and the other with three. As long as they’re in order and don’t skip levels, you’re good.
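The outline rules above (a single H1, headings in order, no skipped levels) are easy to check mechanically. Here is a minimal sketch using Python’s built-in HTML parser; `HeadingCollector` and `outline_problems` are names I made up, and the check looks only at heading order, not at the quality of the heading text.

```python
from html.parser import HTMLParser

HEADING_LEVELS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


class HeadingCollector(HTMLParser):
    """Record the level of every heading tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_LEVELS:
            self.levels.append(HEADING_LEVELS[tag])


def outline_problems(html):
    """Return a list of plain-language problems with the heading outline."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    levels = collector.levels

    problems = []
    h1_count = levels.count(1)
    if h1_count != 1:
        problems.append(f"expected exactly one h1, found {h1_count}")
    if levels and levels[0] != 1:
        problems.append(f"first heading is h{levels[0]}, not h1")
    # Moving back up the outline is fine; only a downward jump of more
    # than one level (h1 straight to h3, say) counts as a skip.
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            problems.append(
                f"h{previous} is followed by h{current}, skipping a level"
            )
    return problems
```

Note that a page with a lone H1 and nothing beneath it passes, as does an outline with uneven numbers of H3 tags under each H2.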
Other tags
There are other semantic body tags, such as the definition tags: DL (definition list), DT (definition term), and DD (definition description). These have their own inherent formatting per the W3C, and are sometimes used (as I did) for other purposes. Your best bet is to simply familiarize yourself with the W3C descriptions, and use tags for their intended purpose. In this way the form of your page can enhance its functionality by making it possible for search engines to better understand it, and bring in more users.