Minimalist Semantic Markup, Where it's at...


It often seems that people confuse minimalism and being concise with being needlessly cryptic, and then obsess over keeping whitespace and comments tiny -- or worse, stripping them out entirely! You see it time after time: people slap dozens of gibberish, near-impossible-to-decipher classes on things, and then "minify" their code by stripping out comments and whitespace to try and sweep their rubbish code under the carpet.

When I say minimalist, that's NOT what I mean. The base concepts of minimalist semantic markup work hand in hand with progressive enhancement, and are in fact an essential part of the second and third steps that article outlines.

The core concept is to leverage semantic tags so you need fewer classes. To use inheritance and the cascading part of cascading style sheets so you aren't throwing new classes in there just because you want a different behavior. It is about using the existing semantic tags as much as possible BEFORE you add DIV or SPAN, keeping the total number of elements to a minimum. It's about removing things that don't actually serve a legitimate purpose, like unnecessary META. It's about not stuffing certain tags full of keywords -- which not only ends up being bloat, but can also get you slapped down by search engines for abuse!

Bad advice from "major" sources, don't let their tools make you the tool!

Right now there's a lot of bad advice regarding the use of selectors! Said nonsense is being promoted as ways to make sites "faster" -- when generally speaking what is suggested has the exact opposite effect. Tools like Google PageSpeed and CSSLint have become outright gibberish, saying absolutely idiotic things like "Don't use tags as selectors"... Is it any wonder we then see mouth-breathing dumbass nonsense vomited up by Turdpress developers like this:

<div class="menu-bar">
               <ul class="menu">
<div class="menu-main-container"><ul id="menu-main" class="menu"><li id="menu-item-191" class="menu-item menu-item-type-post_type menu-item-object-page current-menu-item page_item page-item-189 current_page_item menu-item-191">

... and then these fools have the unmitigated gall to say that's going to be faster or easier than having this markup:

<ul id="mainMenu">
	<li>

As if putting all the styling on the UL or LI, using the ID on the UL and, Joe forbid, writing "#mainMenu li" in the CSS were somehow worse? Sorry, but that's some herpa-freaking-derp drooling on the bib idiocy right there!

George Carlin used to have a joke about abortion, "Not every ejaculation deserves a name". In that way, not every tag needs a DIV around it, a SPAN inside it, or five dozen classes on it!

The end result of such bad practices is typically three to five times the HTML needed, plus more CSS than needed since the OOCSS-style halfwits make "a class for everything" -- to the point where people end up wasting dozens if not hundreds of K of markup on doing 20k or less's job! THEN they run around bragging about how whitespace stripping (also known as minification) took 2k off it?

So that minimalism I'm talking about? It means using classes and IDs only when you need to, or putting one on a parent that wraps multiple elements so none of the children need classes. It means styling the semantic tags by their tag name instead of throwing classes at them. It means not adding semantically neutral tags like DIV or SPAN until you've exhausted what is practical to do with the existing tags.

Take this typical example of garbage code you'll see people vomit up all the time:

<div id="mainMenu">
	<ul class="menuDepth1">
		<li class="menuTopItem"><a href="#" class="menuTopAnchor">Home</a></li>
		<li class="menuTopItem hasChildren">
			<span class="menuTopSpan hasChildren">Menu Section</span>
			<ul class="menuDepth2">
				<li class="menuLowerItem">
					<a href="#" class="menuLowerA">Item 1</a>
				</li>
				<li class="menuLowerItem">
					<a href="#" class="menuLowerA">Item 2</a>
				</li>
				<li class="menuLowerItem">
					<a href="#" class="menuLowerA">Item 3</a>
				</li>
				<li class="menuLowerItem">
					<a href="#" class="menuLowerA">Item 4</a>
				</li>
			</ul>
		</li>
	</ul>
</div>

That there are people out there who see nothing wrong with that is mind-boggling. I start out with a pretty low opinion of others; I don't need that opinion lowered even further. When all child items get the same class, there is rarely a legitimate excuse for any of them to have a class at all, and it's VERY unlikely that anything is being done to that outer DIV that couldn't be done to the UL. (There ARE cases where it serves a purpose; usually when you see this crap it doesn't.) Basically there is no legitimate reason for that entire mess to be much more than:

<ul id="mainMenu">
	<li><a href="#">Home</a></li>
	<li>
		<span>Menu Section</span>
		<ul>
			<li><a href="#">Item 1</a></li>
			<li><a href="#">Item 2</a></li>
			<li><a href="#">Item 3</a></li>
			<li><a href="#">Item 4</a></li>
		</ul>
	</li>
</ul>

... apart from utter and complete developer ineptitude and ignorance. Anything you can do to a DIV you can do to a UL. Since that UL has a perfectly good ID, "menuTopItem" is just "#mainMenu li", "menuTopAnchor" is just "#mainMenu li a", "menuLowerItem" is just "#mainMenu li li", etc, etc, down the line.

For some bizarre reason tools like CSSLint are making up claims about how tag selectors are "slower", at the very least slow enough to negatively impact render time. This is ignorant nonsense when a 133MHz Pentium 1 running IE 5 could handle them in adequate time! More so given that today 1GHz is slow for a handheld! It reminds me of how the anti-table zealots turned "don't use tables for layout" into "never use tables" because of some bullshit claim about render speed -- when again, if a 386/40 running Windows 3.1 and IE4 could handle it... where in blazes did this ridiculous claim come from?

In both cases it's a tissue of lies easily dismissed if you have even the slightest inkling how any of this stuff actually works. The ONLY reason I could see it having enough of a render speed impact is if you have five or ten times as many elements as needed on the page... and even then what's going to have a bigger impact on rendering? More markup to transfer and parse with MORE rules in the form of classes, or applying CSS rules to that markup after the DOM is built using the elements' normal names?

That tools like Google PageSpeed or CSSLint could even make such claims and offer such bad advice shows that whoever is running the show at these places isn't qualified to open their trap on the subject!

Which of course is why if you use the tags for what they mean, style the tags as much as you can BEFORE diving for wrapping elements like DIV, and don't waste time with the allegedly semantic, pointless, idiotic, redundant HTML 5 crap like SECTION, ARTICLE, NAV, HEADER and FOOTER (instead allowing H1..H6 and HR to do their jobs), you can end up with a fraction of the markup and CSS, without the need to waste time on idiocy like code minification / whitespace stripping!

Pointless META and attribute bloat

Equally bad is the sheer amount of misuse, abuse, and gibberish to be found in various things like metadata. Simply put, MOST of the garbage META people slap in the HEAD of their document serves no legitimate purpose to 99.9999999999% of visitors to your site... much of it, like the majority of the "OpenGraph" nonsense, serves ZERO legitimate purpose apart from code bloat. If you are already putting in perfectly good description and keywords META, and a normal TITLE tag, why the hell would you replicate the exact same data in og:description, og:keywords and og:title?!?

I'm not saying all of OpenGraph is rubbish, og:image for example is handy extra information the normal markup can't provide, but them expecting us to double-down on the same information? That's when we REALLY need to tell those out there telling us to do this to go plow themselves!

Even the good META are ridiculously misunderstood and slapped together out of ignorance, typically just because of people copying what others are doing without even asking "is that actually the right way?"

Take the <meta name="description">. It exists for the sole purpose of being text shown below your link on a SERP. That's it, that's what its entire reason for existing is. It's NOT a place to blindly stuff keywords; it should be an interesting natural language sentence or two to draw users in. As Matt Cutts told us repeatedly, "Write for the user, not the search engine!" To that end it is pointless to make it too long, and I advise when possible keeping it under 160 characters. Few search engines will look at anything past that limit, much less make room to show it on the results page.

... or how about the <meta name="keywords">? This one is ALWAYS stuffed to the gills with so much crap it's mind-numbingly stupid. The ENTIRE reason this tag exists is to hold 7 or 8 single words or proper names that exist between <body> and </body> that you want a slight uprank on... preferably 128 characters or less; many sources like SEOWorkers suggest 96 characters or less. If you exceed these limitations it WILL be ignored. I cover this more in section 2.4 of my "What's wrong with YOUR website" article.

Even the blasted TITLE tag is just filled to the brim with rubbish by so many people who are ignorant of what it is for. Again, see my "What's wrong with YOUR website" article under Section 2.6, "Useless TITLE tag".

Another gem is the robots META -- while certainly noindex and nofollow have their place, *NEWS FLASH FOLKS* 'index' and 'follow' are what robots do by default anyway!!! You want it to be indexed or followed, leave out the META altogether! That's it. Declaring those values accomplishes nothing except putting more bytes in your HEAD.

... and that's before we talk about all the pointless garbage NOTHING cares about like "author", "copyright", "rating". Just get rid of them, they serve zero legitimate purpose apart from wasting bandwidth.
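As a rough sketch of where the advice above lands, a HEAD that follows it might look something like this. The title, description text, keyword list, and image URL are made-up placeholders for illustration, not taken from any real page:

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Minimalist Semantic Markup</title>
<!-- natural language, well under 160 characters -->
<meta
	name="description"
	content="How leaning on semantic tags and the cascade lets a page get by with far less markup."
>
<!-- a handful of single words that would also appear in the body text -->
<meta
	name="keywords"
	content="semantic,markup,minimalism,selectors,classes,cascade,META"
>
<!-- the one OpenGraph property the normal markup can't supply;
     hypothetical image URL -->
<meta
	property="og:image"
	content="http://www.example.com/images/preview.png"
>
<!-- no robots META: leave it out and the page gets indexed and followed -->
<!-- no author, copyright or rating META either -->
</head><body>
<h1>Minimalist Semantic Markup</h1>
</body></html>
```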

To LINK or not to LINK?

While it can be tempting to fill up on LINK tags and rel attributes, you have to be careful not to overdo it. Among the most misused are rel="next" and rel="prev" (or its older synonym "previous"), as you'll often see people assign them to multiple LINK tags or anchors. Big tip? You can only have one of each work at a time, as they exist to point at the next or previous page in a multi-page article. This is so you can hit "forwards" or "back" if the history is empty and still get to those pages. As such, across browsers the behavior of having multiple instances can be... inconsistent. Some browsers will only obey the first one found, others only obey the last one declared. Much like IDs, if you're gonna use it, you only get to use it ONCE, so choose wisely!!!
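A minimal sketch of the "one of each" rule, assuming a hypothetical three page article with made-up URLs. Note that "prev" is the keyword the current HTML specification lists, with "previous" treated as a synonym:

```html
<!-- in the HEAD of hypothetical page 2 of a three page article -->
<link rel="prev" href="/article/page-1">
<link rel="next" href="/article/page-3">
<!-- that's it: no second rel="next" on a "read more" anchor,
     no extra rel="prev" on a breadcrumb -->
```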

In that same way it's easy to go overboard on things nobody uses. RSS is a classic example of that where you'll see people having three to six different feeds for the same data. This is 2015, I think you're safe with just RSS 2.x, since few clients even notice a difference anymore and support all the formats. Well, apart from the Atom holdouts who would never notice since even their clients support regular RSS anyways! Sorry Mark.

TITLE attribute, what are you smoking?

Far too often you'll see code where someone puts title="something" on an anchor or other tag that is identical to the contents of the tag. The only legitimate usage scenario for that was the now defunct (with classic Opera) accesskeys menu, so as to show text instead of the URI. That means today, there is ZERO reason to do this. This may be a gross oversimplification, but generally if you need to put title="" on anything that's not an ACRONYM, ABBR, or LINK you are likely doing so for no reason. Anchors mostly fall under the same rule, though there are some cases where you may want to provide more information than the contents (like an image) can provide.

But if you have text inside the element that's the same as your title="" attribute, DON'T WASTE CODE PUTTING TITLE ON IT!!!

The value attribute works the same way on OPTION tags. If your value is identical to the content of the OPTION, don't waste effort and bandwidth on declaring the value!
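To make that concrete, here is a small before-and-after sketch. The link target, field name, and option text are made up for the example. When an OPTION has no value attribute, the browser submits its text content instead:

```html
<!-- redundant: title and value just repeat the text content -->
<a href="/links" title="Links">Links</a>
<select name="size">
	<option value="Small">Small</option>
	<option value="Large">Large</option>
</select>

<!-- same behavior, less code -->
<a href="/links">Links</a>
<select name="size">
	<option>Small</option>
	<option>Large</option>
</select>

<!-- title earning its keep: it says something the content doesn't -->
<abbr title="Search Engine Results Page">SERP</abbr>
```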

Presentation - NOT!

Generally speaking presentation in ANY form has ZERO business in your HTML. This is another key feature of using minimalist semantic markup. If you have a logo that represents text like a company name, you put the text in the HTML and the damned image in the CSS. If the image is NOT part of the content, it has ZERO business in an IMG tag!!! This practice can pay other dividends like better use of caching models, since CSS is cached while markup on a new page using the same images is not. Likewise it means that should you want to reskin the site you don't have to change the HTML, very important if you are building a responsive layout like you should be given this is 2015!

In that same way, if you are using the <style> tag, you are writing garbage. Don't even THINK about using it as you are missing a caching opportunity, or even a pre-caching one. To that end I also recommend monolithic stylesheets so you can pre-cache the appearance of sub-pages -- but I do so because I see zero legitimate reason for any website to have more than 48k of CSS of its own (not counting things like social plugins or adverts) apart from developer ineptitude. The style="" attribute should also be avoided and only used in the rarest of corner cases where the value being set conveys meaning. Examples of such exceptions include the width of a percentage bar on a graph or the text-size in a tag cloud. Apart from those VERY rare cases, just say no to "style" in your markup!

This also means that MOST of those things we were told to stop doing in 1998 with HTML 4 Strict? STOP DOING THEM!!! That means no align, border, font, target or any of the dozens of other bits of outright silly nonsense that had NO business existing in HTML in the first place!
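A sketch of what that separation might look like in practice. The company name, the "/screen.css" path, and the "percentBar" class are all hypothetical, and the stylesheet itself is left out since the point here is what stays in the markup:

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Example Company</title>
<!-- one external, cacheable stylesheet instead of <style> blocks -->
<link rel="stylesheet" href="/screen.css" media="screen">
</head><body>

<!-- the company name is content, so it lives in the markup as text;
     the logo artwork would be a background-image set in screen.css -->
<h1><a href="/">Example Company</a></h1>

<!-- the rare exception: this width IS the data being shown,
     so it is set with style="" rather than in the stylesheet -->
<div class="percentBar">
	<div style="width:42%;">42%</div>
</div>

</body></html>
```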

Again though, there are exceptions like when some drop-in code you 'have' to use like a Facebook like button was written by people who don't subscribe to the same philosophy, making you resort to things like align="center" like it's still 1997. What was that I was saying about developers having their head wedged so far up 1997's backside it could floss with their hair?

Still, you see our site logo? The scissors separating HTML from CSS? That's the ideal. Your content and tags saying what things ARE in your HTML. The presentation -- what it looks like -- in your EXTERNAL CSS file(s). Try not to mix them! This isn't chocolate and peanut butter. Cut the cord, your sites, any poor sod trying to maintain them, and the visitors to your sites will thank you for it.

A final bit of the minimalism is to understand how to form URIs properly. Time and time again you'll see developers wasting time stating the full URI in their anchors for pages on the same domain as the current page! There is NEVER a legitimate reason to do this and it's a flat out ignorant waste of bandwidth.

You can assume that "/" at the start of a URI will point at the root of the current host. For example on this page you are viewing right now <a href="/links">links</a> will always resolve to http://www.cutcodedown.com/links -- there is NO reason to state the full URI in that anchor. In that same way, relative links in the HTML are based on the current directory, and relative links in the CSS on the directory the stylesheet is in.

The only scenario in which this could become broken is when you want to use relative links but are using "friendly links" like the ones on this site. For example if you were in http://www.cutcodedown.com/links/ (note the trailing slash, which makes "links" behave like a directory) and tried to access <a href="images">images</a> without that leading slash, it would take you to http://www.cutcodedown.com/links/images (which would 404) and not the root images directory.

The solution in that case is a handy little tag called BASE. For example if you declared <base href="http://www.cutcodedown.com/"> all links without leading slashes or a host declaration will resolve from that, regardless of what the full URL of the page is. Laughably some fools want to get rid of that tag as "pointless" *rolls eyes*
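A minimal sketch of BASE at work, assuming the page is served from a deeper friendly URL such as the hypothetical http://www.cutcodedown.com/links/archive. Without BASE, href="images" on that page would resolve to http://www.cutcodedown.com/links/images:

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<!-- declare BASE before any element that carries a URL -->
<base href="http://www.cutcodedown.com/">
<title>Links Archive</title>
</head><body>

<!-- resolves to http://www.cutcodedown.com/images -->
<a href="images">images</a>

<!-- a leading slash resolves from the host root with or without BASE -->
<a href="/links">links</a>

</body></html>
```

One thing worth keeping in mind with this approach: BASE applies to every relative URL on the page, including fragment-only ones like href="#top", so those may need the page's path in front of them.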

Ideal Code Sizes

Once you have the methodology down pat, generally speaking a "normal" page on a website should only need so much markup. The amount it should have is based on the amount of plaintext, number of form elements (INPUT, OPTION, TEXTAREA, and BUTTON), number of media embeds (IMG, AUDIO, VIDEO, and OBJECT) and so forth to be found on a page. A reasonable guesstimate of total markup size I use is this:

  Expected HTML size in bytes = 2048 +
                     plaintext * 1.5 +
        content media elements * 256 +
                 form elements * 128 +
                       anchors * 128

That's the IDEAL. A well written page should try to at least come close to that number, and if you are under it you are golden -- and that's WITHOUT whitespace stripping / minification! If you exceed it by more than double, the page is probably utter and complete rubbish...

Also when I say content media, refer above to where I'm talking about images in the CSS vs. the markup.

For example let's run the page on this site for the first section of "What's Wrong With Your Website" through these numbers. It has 10,707 bytes of plaintext, only one content image, zero form elements (since the disqus part is JS added), and let's say two dozen or so anchors (ballpark guesstimate).

Size Calculation for "What's wrong with your website, Part 1"

                                  Size   Multiplier    Total
  Overhead                                             2,048
  Plaintext                     10,707          1.5   16,061
  Media Elements                     1          256      256
  Anchors                           24          128    3,072
  Ideal HTML Size (in bytes)                          21,437

... and how big is the actual page's markup? 17,906 bytes. WELL under our ideal size. That's without whitespace stripping/minification, and including full formatting, comments on closures, and even the only OpenGraph tag worth actually using!

THAT's what I'm talking about when I say minimalist! Your normal plain-Jane boring turdpress template from one of the nube-predating scam artist whorehouses like ThemeForest or TemplateMonster? Triple that number without even batting an eyelash. Mix in idiocy like Bootstrap? Double it again!

End result being people wasting dozens or even hundreds of K of HTML on 8 to 32k's job! THEN they have the brass stones to call their bloated harder to deal with nonsense "easier". A delusion based almost entirely in apathy, ignorance, and wishful thinking.
