Class Struggle

I’m irked by the fact that I keep running into web sites whose mark-up class names follow the content language. The web and its mark-up are open to the world, to both humans and machines. For better or for worse, English is the lingua franca of mark-up.

The Dutch government’s web guidelines on descriptive IDs and class names don’t explicitly state it as a requirement, but their examples do leave the impression that we can use a language other than English. This notion is in my opinion complete and utter nonsense, and even harmful. For one, lumping class names and IDs together like that obscures their specific relevance to the structure and the content. You may think that I’m splitting hairs, but following the example of this ‘official’ guideline could lead to an unmaintainable front end.

In the web stack a class name is, strictly speaking, not part of the content and should from the outset be in the same language as the mark-up. HTML is based on English, and the metadata for that structure should at least also be in English. An ID, on the other hand, can be part of the content (i.e. URL hashes, also known as fragment identifiers: http://en.wikipedia.org/wiki/Fragment_identifier), and when it is, it ought to follow the content language of the current document. It can get a little messy, but there is method to the madness.

The key is that one is a classification and the other an identifier. A classification is best when it is a noun and has descriptive/semantic meaning. This semantic meaning is not directly related to the content but to its context. This is an important difference and why you shouldn’t base class names on the content language.
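
To make the distinction concrete, here is a minimal sketch that assumes a hypothetical Dutch-language page listing opening hours. The names `openingstijden` and `schedule` are illustrative assumptions and are not taken from any guideline. The ID doubles as a fragment identifier, so it follows the content language, while the class name stays in English.

```html
<!DOCTYPE html>
<html lang="nl">
<head>
  <meta charset="utf-8">
  <title>Bibliotheek</title>
</head>
<body>
  <!-- The fragment identifier shows up in the URL, so it is part of the content: Dutch -->
  <a href="#openingstijden">Openingstijden</a>

  <!-- The class name classifies the section and is metadata: English -->
  <section id="openingstijden" class="schedule">
    <h2>Openingstijden</h2>
    <p>Maandag tot en met vrijdag, 9.00 tot 17.00 uur.</p>
  </section>
</body>
</html>
```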

A common mistake is to add an identifying class name to a semantic structure:

(Dutch: ‘voorpagina’ = ‘front page’)
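
As a hypothetical illustration of that mistake, assuming an article on a Dutch-language front page, the mark-up might look something like this. Only the class name `voorpagina` comes from the text above; the elements and their content are assumptions.

```html
<article>
  <!-- Anti-pattern: a Dutch, location-specific class name on an element that is already semantic -->
  <header class="voorpagina">
    <h1>Titel van het artikel</h1>
  </header>
  <p>Tekst van het artikel.</p>
</article>
```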

There are three mistakes here:

The header tag itself indicates both a hierarchy and semantic meaning so there is no need to classify or identify it directly.
The class name is specific to the content, in this case to where the content is located. It would be better to add ‘homepage’ to the body tag.
It’s very unlikely that a machine is going to be able to assert that this header is on the front page.

It’s an easy fix:
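
A minimal sketch of such a fix, assuming a Dutch-language home page with one leading article. The element content and the exact placement of the class names are assumptions made for illustration.

```html
<!-- 'homepage' describes the page as a whole, so it sits on the body -->
<body class="homepage">
  <!-- 'lead' classifies the article; the header needs no class of its own -->
  <article class="lead">
    <header>
      <h1>Titel van het artikel</h1>
    </header>
    <p>Tekst van het artikel.</p>
  </article>
</body>
```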

A machine might be able to assert that this content is important through the use of the English word 'lead'.
The class name isn’t tied to the location of the article but does classify the type of article it is, or rather its hierarchy.
This is also a way to highlight a sticky article without needing the word 'sticky', which is a CMS feature. Transposing web app features to class names is also a common mistake.
The semantic header inherits this context without the need to declare it specifically.

The one big disadvantage of this is that English is not everybody’s cup of tea, but it still seems better than the alternatives. Simplified Chinese as a second language perhaps? Perhaps not.

Once again, the main reason why I think class names should be in English is that HTML’s tag approach is English-based. English is the de facto language of modern high-level programming, which is why metadata systems like Microformats use it as a universal, human-readable way to catalogue and enrich content. In fact, effectively all metadata systems that tie content together are in English. There is no reason for the metadata not to be in English. Adding fragmentation to this ecosystem by inserting an alternate linguistic layer is foolish because it immediately excludes the content from a larger pool of content on the Internet. In one stroke, you would have given up the value that richer context adds.
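
The Microformats point can be illustrated with a small hCard sketch. The class names `vcard`, `fn`, `title` and `org` are the English terms hCard defines; the person and organisation are invented for this example, and the Dutch text is an assumption about what a Dutch-language page might contain.

```html
<!-- The hCard class names are fixed English terms;
     only the text content follows the page language -->
<div class="vcard" lang="nl">
  <span class="fn">Jan Jansen</span>,
  <span class="title">adviseur</span> bij
  <span class="org">Voorbeeld BV</span>
</div>
```

A parser that knows hCard can pick out the name and organisation here without understanding Dutch, which would not be possible if the class names had been translated along with the content.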

Class names are metadata and not content. Metadata conventions use English and so should your class names.

Egor Kloos is a UX consultant at Lunatech Research, where he oversees design and front-end matters. This article was previously published on dutchcelt.