Archive for the ‘HTML’ Category

Why <b> and <i>?

September 29, 2012

Common Web development wisdom states that the <b> and <i> tags, and the various other presentational font style elements, should be scrupulously avoided because they are purely presentational in nature and carry no semantic meaning. So imagine my surprise to learn that not only would they be in HTML 5, but they are standard already according to HTML 4.01 strict! (Of the font style elements, only <u> and <s> are deprecated.)

My question is, why? That’s not to say I disagree with the decision, but what’s the rationale?

The WHATWG FAQ explains it thus:

The inclusion of these elements is a largely pragmatic decision based upon their widespread usage, and their usefulness for use cases which are not covered by more specific elements.

While there are a number of common use cases for italics which are covered by more specific elements, such as emphasis (em), citations (cite), definitions (dfn) and variables (var), there are many other use cases which are not covered well by these elements. For example, a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, or a ship name.

Similarly, although a number of common use cases for bold text are also covered by more specific elements such as strong emphasis (strong), headings (h1-h6) or table headers (th); there are others which are not, such as key words in a document abstract or product names in a review.

Some people argue that in such cases, the span element should be used with an appropriate class name and associated stylesheet. However, the b and i elements provide for a reasonable fallback styling in environments that don’t support stylesheets or which do not render visually, such as screen readers, and they also provide some indication that the text is somehow distinct from its surrounding content.

In essence, they convey distinct, though non-specific, semantics, which are to be determined by the reader in the context of their use. In other words, although they don’t convey specific semantics by themselves, they indicate that that the content is somehow distinct from its surroundings and leaves the interpretation of the semantics up to the reader.

That makes sense. In addition to the given examples, I imagine it would be appropriate to use <i> for scare italics to indicate use-mention distinction. There’s also a pretty strong case to be made that using <cite> to mark up a book title even when it’s not technically a citation is a misuse. (Case in point: MLA or APA citations, in which the whole entry is the citation, but only the title is italicized, and only for certain kinds of works.)

I disagree with some of the other examples, though. For instance, wouldn’t it be more appropriate to use <q> with a class name and some CSS rules for thoughts? (I assume that they’re referring to the way thoughts are often italicized in prose fiction to distinguish them from spoken dialogue.) Aren’t keywords (often rendered in bold or italics) a special class of emphasis? If not, WHATWG HTML uses the <mark> tag to denote “relevance” as distinct from “importance.”
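A minimal sketch of the <q> idea for thoughts, assuming a made-up class name (`thought` is purely illustrative and comes from no spec):

```html
<style>
  /* Suppress the generated quotation marks and italicize instead. */
  q.thought {
    quotes: none;
    font-style: italic;
  }
</style>

<p>
  She glanced at the clock. <q class="thought">I'm never going to make it,</q>
  she realized.
</p>
```

Without the stylesheet, the thought falls back to ordinary quotation marks, which still sets it apart from the narration.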

Another suggested use of <b> in the WHATWG spec is as a lede. A lede is just the first sentence or paragraph, after all. That means that the semantic meaning (such as it is) of a lede is already defined just by where it is in the document. So why bother with <b> at all? In the case of highlighting only the first sentence rather than the whole paragraph, why choose <b class="lede"> over <span class="lede">? I can’t think of any reason but the presentational nature of <b>, because (again) what little semantic meaning a lede has is already implied by its position in the document. A CSS solution using the :first-child pseudo-class to bold the first <p> in an <article> or <section> would work fine, perhaps using something like <p class="lede"> as a fallback. (You could even use :first-line if you don’t mind cheating the definition of “lede” a little bit. I don’t advocate cheating definitions, but I have seen printed works bold the first line rather than the lede, properly speaking.)
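A rough sketch of the :first-child approach, assuming the lede really is the first child of its container (the `lede` class is the fallback mentioned above; the sample text is invented):

```html
<style>
  /* Bold the opening paragraph when it is the first child of its container,
     or when it is marked explicitly with the fallback class. */
  article > p:first-child,
  section > p:first-child,
  p.lede {
    font-weight: bold;
  }
</style>

<article>
  <p>This opening paragraph is the lede and is bolded by position alone.</p>
  <p>This second paragraph is left at normal weight.</p>
</article>
```

Note that :first-child stops matching as soon as anything, such as a heading, precedes the paragraph inside the container. That is where the explicit class earns its keep.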

By far my biggest disagreement with WHATWG concerns this:

The problem with elements like <font> isn’t that they are presentational per se, it’s that they are media-dependent (they apply to visual browsers but not to speech browsers). While <b>, <i> and <small> historically have been presentational, they are defined in a media-independent manner in HTML5. For example, <small> corresponds to the really quickly spoken part at the end of radio advertisements.

First of all, it’s completely at odds with my understanding of the semantic Web. The way I learned it, HTML is supposed to describe content, not how that content should be displayed. Presentational information is the job of CSS, just as behavior is the job of JavaScript. Thus it actually is a problem that elements like <font> (and <b> and <i>) are presentational. However, maybe that’s just a difference of opinion between WHATWG and the sources from which I learned this attitude.

More importantly, I can see little difference between “presentational per se” and “media-dependent.” How something is presented depends on the medium in which it is presented. To illustrate, let’s look at that statement about <small>. You can’t have spoken words that are small, any more than you can have text (be it on screen or in print) that is spoken quickly, because by definition it is not spoken.

I object most strongly to the sentence before last: “[These tags] are defined in a media-independent manner in HTML5.” This misses the entire point. The <b> and <i> tags are explicitly named after bold and italics. That tag names are supposed to have meaning is clear from the existence of tags like <em>, <cite>, <address>, and <code>, or the introduction of tags like <article>, <section>, <nav>, and <header>. Granting the alleged distinction between presentation and media-dependence doesn’t help: They may claim that the tags are now media-independent, but defining in a spec that you can speak in italics doesn’t make it so.

Returning to the <small> example, it seems to me that what WHATWG is after isn’t size, speed, or any other presentational attribute, but rather de-emphasis or unimportance. The legalese in ads is printed small or spoken quickly because it is less important than the rest of the ad copy. The point of the ad is that you should buy the product, not that conditions apply and results may vary, even though that information is important enough to be included. The <aside> element, styled appropriately, would work well here. (Admittedly, <aside> is not suited for parentheticals within the flow of the text, but the ad copy example doesn’t have that issue. Use of <small> for such parentheticals, though, seems to differ from using a <span> only in presentation, which, again, is a job for CSS.)
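As an illustration only, the ad-copy case might be marked up along these lines (the class name `fine-print` and the ad text are invented for the example):

```html
<style>
  /* De-emphasize the fine print visually; the class name is illustrative. */
  aside.fine-print {
    font-size: 0.8em;
  }
</style>

<article>
  <p>Try the new Widget 3000 today!</p>
  <aside class="fine-print">
    <p>Conditions apply. Results may vary.</p>
  </aside>
</article>
```

The markup says only that the legalese is tangential to the main copy. How small or how fast it is rendered is left to the stylesheet.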

After all this huffing and puffing on my part, the bottom line is that tags like <i> should be kept around because there are good reasons for using them, but that doesn’t mean there aren’t also bad reasons.

So what should Web authors do? All I can offer is my opinion: Use tags with clear semantic meaning (like <em>) wherever possible, but don’t stretch their definitions just to avoid using a presentational tag. Use the presentational tags, with appropriate class names, when the rules of grammar and style that you follow prescribe that presentation for the kind of content you’re marking up and there’s not a more semantic tag that will work. This follows both the meaning given in the FAQ (namely that there is some semantic difference from the surrounding text) and the literal meaning of the tag name (namely that the text ought to be displayed in, for instance, italics). It’s basically the same principle as using quotation mark characters rather than <q> for something that is supposed to be enclosed in quotation marks but isn’t a quotation. Again, this is just my opinion, and I’m not aware of whether it conflicts with any established best practices, but so far it seems like a good approach.
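A short sketch of how that rule of thumb might look in practice (the class names and sample sentences are hypothetical):

```html
<!-- A more specific element fits, so use it. -->
<p>You <em>must</em> save your work before closing the window.</p>

<!-- No more specific element fits, but style guides italicize these anyway. -->
<p>The house cat, <i class="taxonomy">Felis catus</i>, ignores all of this.</p>
<p>She sailed on the <i class="ship-name">Queen Mary</i>.</p>

<!-- Quotation mark characters, not <q>, for text that is not a quotation. -->
<p>The so-called “shortcut” added an hour to the trip.</p>
```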

Categories: HTML

Is Nofollow Standard?

August 22, 2012

For some reason, I recently wondered if I should be annoyed that Web site owners are generally expected to include the non-standard rel="nofollow" attribute in links in their HTML. The problem with this thought is that calling the attribute “non-standard” is not particularly accurate. It depends on exactly what you mean by “non-standard.”

First, to be absolutely clear, adding rel="nofollow" to links is perfectly acceptable according to the HTML 4.01 standard. It passes the W3C’s validator. Frankly, though, I doubt that many people would lose sleep if it didn’t validate.

With that said, let me explain what I mean by “It depends.” When I first turned to the spec to fact-check myself, I came across the list of link types (which are the acceptable values for the rel attribute in a link), and “nofollow” isn’t on it. The “nofollow” value does appear in the spec (scroll to “Robots and the Meta Element”), but in the context of the <meta> element, not the rel attribute. Besides, that section is “informative, not normative.”
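For comparison, here is a sketch of the two places the value can turn up: the robots <meta> form from the spec’s informative section, and the per-link rel form this post is about (example.com is a placeholder):

```html
<!-- Page-wide: asks robots not to follow any link on this page. -->
<meta name="robots" content="nofollow">

<!-- Per link: the link type discussed in this post. -->
<a href="http://example.com/" rel="nofollow">An example link</a>
```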

On the other hand, there is nothing in the spec that prohibits Web developers from creating their own link types. (Whether it’s actually a good idea is another matter, which isn’t relevant here due to the widespread adoption of nofollow.) So anyone can use rel="nofollow". But there’s a catch. Here’s what the section on link types has to say:

Authors may wish to define additional link types not described in this specification. If they do so, they should use a profile to cite the conventions used to define the link types. Please see the profile attribute of the HEAD element for more details.
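To make that recommendation concrete, a minimal HTML 4.01 page that declares a profile might look something like this. The profile URI is a placeholder, not a real profile that defines nofollow:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<html>
<!-- The profile URI is a placeholder for wherever the link type is documented. -->
<head profile="http://example.com/profiles/nofollow">
  <title>Profile example</title>
</head>
<body>
  <p><a href="http://example.com/" rel="nofollow">An example link</a></p>
</body>
</html>
```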

I have to admit that this is the first I’ve even heard of the profile attribute. Is this just a case of my own ignorance? Well, yes, but then this isn’t the most common attribute. I didn’t see it on Wikipedia, even though Wikipedia uses nofollow. My Blogger blog doesn’t use it, either. (I also didn’t see it on a handful of other high-profile sites I checked, but to my surprise they didn’t use nofollow, either, so I won’t bother listing them.) I did notice it on my WordPress blog, though. But not so fast: In that last case, the profile attribute points to a profile that does not list nofollow.

This isn’t necessarily a problem. The above quotation from the spec says that authors “should use a profile,” and in this context, “should” has a specific meaning. The spec says:

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in [RFC2119]. However, for readability, these words do not appear in all uppercase letters in this specification.

RFC 2119, in turn, has this to say:

3. SHOULD   This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.

While I’m skeptical as to whether most site owners have actually “carefully weighed” the “full implications” of using nofollow without an accompanying profile, it’s not really that important in the scheme of things. The point is that there are “valid reasons” to use link types, and that means the spec clearly allows nofollow to be implemented in exactly the way I have observed.

All this is just a long-winded way of saying that nofollow is perfectly acceptable according to the standard, and only “non-standard” in the sense of not being explicitly included in the standard itself. The standard absolutely allows for it.

(It’s also in the W3C’s draft HTML5 spec and in WHATWG’s Living Standard, so if you’re the HTML5 type, rel="nofollow" is definitely standard. For now.)

The moral of the story, if there is one, is “Always check your facts before you complain about something.”

Categories: HTML

Table-less Forms: They Are Really That Hard.

September 7, 2011

A couple weeks ago, I posted a list of criteria for what I’d like to see in a CSS-based form that emulated the nifty features that tables provided. To recap, I wanted:

  • Labels lined up in the same row as form fields
  • Label column that stretches/shrinks to the size of the longest label
  • As little extra HTML as possible

The good news: I quickly came up with a solution that met all my criteria. The bad news: It only works with simple forms. (“I quickly came up with a solution” should have been a tip-off!) First, I’ll explain what I did, and then I’ll explain why it can’t handle much complexity.

My solution was to nest the inputs I wanted to align inside a div, style the div so it would shrink to the width of its contents, and then use absolute positioning on the inputs (nested inside relative-positioned elements) to push them to the left by 100% of the width of the div.
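Until the real code is posted, here is one way the description above could be realized. This is a guess at the technique, not the actual test code: it assumes the relative-positioned elements are the labels, that `left: 100%` is what does the pushing, and the `fields` class name is invented.

```html
<style>
  /* Shrink the wrapper to the width of its widest label. */
  .fields {
    display: inline-block;
  }
  /* Each label is the positioning context for its own input. */
  .fields label {
    display: block;
    position: relative;
    white-space: nowrap;
    line-height: 1.8em;
  }
  /* Place each input just past the wrapper's right edge. */
  .fields input {
    position: absolute;
    left: 100%;
    top: 0;
    margin-left: 0.5em;
  }
</style>

<form action="#">
  <div class="fields">
    <label>Name <input type="text" name="name"></label>
    <label>E-mail address <input type="text" name="email"></label>
  </div>
</form>
```

Because the inputs are absolutely positioned, they contribute nothing to the wrapper’s width or to each row’s height, so the label column sizes itself to the longest label. The same fact means anything taller than one line of label text will overlap the row below it.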

If it sounds too easy, that’s because it is. Here’s what doesn’t work:

  1. It only works with inputs that are the same height as the labels, which means it won’t work with <textarea>s, multiple-row <select>s, and probably others I’m forgetting.
  2. Checkboxes and radio buttons inside the wrapper div can’t be put before their labels because the positioning automatically puts the inputs on the right.
  3. Longer labels, such as you might need to use for checkboxes and radio buttons, will stretch the wrapper <div>’s width.
  4. Those last two aren’t problematic if you can group all your text inputs together and save everything else for the end. If you want to put anything between two groups of text inputs, though, you’ll need two separate wrapper <div>s, which won’t line up with each other.

I will post the code as soon as I can get it cleaned up and looking nice. (My test code is embarrassingly messy.) I also plan to revisit this project and see what fixes I can make. The checkbox problem at number 2 shouldn’t be too hard to solve. Then again, that’s what I said about CSS forms in general.

Categories: HTML

Table-less Forms: Are They Really That Hard?

August 24, 2011

I probably don’t need to explain the love-hate (and in many cases, just plain hate) relationship Web designers have with tables. If you don’t believe me, a Google search turned up this lovely presentation on presentational tables that explains the issues in a much less hot-headed fashion than what I’ve typically seen before. What I usually read boils down to “Tables are evil, and you should never use them.”

Naturally, this presents a few problems: Some things that are easy to do with tables and presentational HTML are much harder to do with CSS and purely semantic markup. For instance, the A List Apart article, Practical CSS Layout Tips, Tricks, & Techniques outlines a few of them and offers some solutions. There’s just one thing that I have yet to see done to my satisfaction: Forms.

Every technique I’ve ever seen for table-less forms has the problem of sacrificing some of the functionality you get with a table. Sure, they can put labels and fields on the same row and even make sure the fields all line up together. However, most of them use either a fixed width or a percentage width for the label column. A table, on the other hand, will resize the columns to fit the content.
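A stripped-down sketch of the fixed-width style of solution, for illustration. It is not taken from any particular article, and the `fixed` class name and the 8em width are arbitrary:

```html
<style>
  /* The label column is a fixed 8em wide, whatever the label text is. */
  form.fixed label {
    float: left;
    clear: left;
    width: 8em;
  }
  /* Each field starts just past the label column. */
  form.fixed input {
    display: block;
    margin: 0 0 0.5em 8.5em;
  }
</style>

<form class="fixed" action="#">
  <label for="name">Name</label>
  <input type="text" id="name" name="name">
  <label for="email">E-mail address</label>
  <input type="text" id="email" name="email">
</form>
```

The fields line up, but a label longer than 8em wraps or overflows instead of widening the column, which is exactly the shortcoming described above.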

Worse still, some of the supposedly more semantic approaches introduce extra markup. The A List Apart article linked above gives us this mess:

<div style="width: 350px; background-color: #cc9;
border: 1px dotted #333; padding: 5px;
margin: 0px auto";>

Shoe size:<span

Go ahead - write something...

Admittedly, that article is ten years old, which probably explains why it’s a rather extreme example. But come now, rather than just ? Inline styles? More importantly, ? Isn’t that awfully similar, structurally speaking, to a table? I thought part of the objective was to trim the fat from the code, and this doesn’t have too many fewer tags than a table. To be fair, a more recent article from 2006 does a better job (though I’m not sure why the author chose an unordered list), but still doesn’t solve the width problem.

Now some people will tell you that there’s no problem with using tables to lay out forms. After all, they really are tabular: Labels get one column, form fields get another. The label refers to the form field on the same row. Besides, if you look at a real form, i.e. one made of ink and paper, it looks like a table. Tables are also quick and simple to create, which is no small consideration when time is a factor.

I agree, to a point. I’ve certainly used tables to line up forms before, and I feel no remorse. Even so, it’s not a perfect solution. For one thing, while you can make a case that forms are tabular enough to excuse the use of tables, that isn’t really the most semantic way to do it. For another, you’re still stuck with table markup and form markup.
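For reference, the table-based layout under discussion looks roughly like this (field names are placeholders):

```html
<form action="#">
  <table>
    <tr>
      <td><label for="name">Name</label></td>
      <td><input type="text" id="name" name="name"></td>
    </tr>
    <tr>
      <td><label for="email">E-mail address</label></td>
      <td><input type="text" id="email" name="email"></td>
    </tr>
  </table>
</form>
```

The label column sizes itself to the longest label with no CSS at all, at the cost of three wrapper elements around every label and field pair.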

What I really want is a method that will line everything up the way a table does, stretch or shrink the label column to fit the text, and minimize the extra HTML. To reiterate, I have never seen a solution that does this, though that may very well be because I haven’t looked very hard. It seems like it shouldn’t be too difficult to create such a solution.

So I think I will. Stay tuned.

Categories: HTML