When working with the DOM .nodeName property there are two hard-and-fast rules that most people abide by:
- The node names of HTML elements are always uppercase, even if they're explicitly created using lowercase characters. <html> will result in a .nodeName === "HTML" (see the HTML 5 draft).
- The node names of XML elements are always in the original case, as specified when they're created. <data> will result in a .nodeName === "data", <DATA> will result in a .nodeName === "DATA".
Knowing these rules can be useful because it allows you to optimize your code. If you know that you're in an HTML document you can avoid having to upper/lowercase your .nodeName checks and you can just always assume that you're dealing with a .nodeName that's uppercase. This results in faster selectors for Internet Explorer and other minor optimizations.
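As a minimal sketch of the shortcut described above: the helper name `isTagAssumingHtml` is hypothetical, and plain objects stand in for DOM nodes so the snippet runs outside a browser. It only illustrates why skipping the case conversion is tempting, and where it stops working.

```javascript
// Hypothetical helper: assumes it is only ever called inside an HTML
// document, where the rules above say nodeName is already uppercase,
// so no toUpperCase()/toLowerCase() call is needed.
function isTagAssumingHtml(node, upperCaseName) {
  return node.nodeName === upperCaseName;
}

// Plain objects stand in for DOM nodes (an assumption for illustration).
var htmlDiv = { nodeName: "DIV" };   // what an HTML <div> is expected to report
var xmlData = { nodeName: "data" };  // what an XML <data> reports

console.log(isTagAssumingHtml(htmlDiv, "DIV"));  // true
console.log(isTagAssumingHtml(xmlData, "DATA")); // false: XML keeps the original case
```

The comparison is a single strict equality check, which is the whole appeal. It is only correct as long as every node really does follow the uppercase rule.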
However recently I've been running across two cases that've been especially problematic and have bucked the trend.
Importing Nodes from XML
The first is for browsers that support the adoptNode/importNode DOM methods. These methods allow you to move (or clone) a node from one DOM document to another. In this way you can move an XML node from an XML document and insert it into an HTML document. Normally this shouldn't matter much but, as it turns out, the original .nodeName case sensitivity is preserved from the original XML-ness of the node.
Thus if you have a lowercase XML element (<data>) and you use adoptNode or importNode to bring it into your HTML document, the result will be .nodeName === "data" -- which completely bucks the trend for "all HTML elements' node names are always uppercase." I consider this to be a bug, since the DOM element is now in an HTML document, not an XML document, and should behave as such.
Unknown HTML 5 Elements
The second bit of weirdness comes from people attempting to use the new elements from HTML 5 in browsers that don't support them. Most browsers behave perfectly well when using some of the new HTML 5 elements (in that they don't freak out, and they support some level of styling). For Internet Explorer you must use the HTML 5 Shim technique - this will give unknown HTML 5 elements (such as a <section> element) the ability to be styled and hold contents.
However there is an additional gotcha: When Internet Explorer encounters an element that it doesn't recognize it leaves the .nodeName in its original case. Thus if you have a <section> element in your HTML page the result will be .nodeName === "section" -- which directly contradicts the normal case sensitivity of the .nodeName property in HTML documents.
To try and understand all of this I made a bunch of test cases using a number of doctypes and document styles.
- HTML 5 document - uses the HTML 5 Doctype.
- XHTML document served as text/html.
- HTML document served with no doctype.
- XHTML document served with correct mimetype.
The important part of the test page is quite simple:
and the test cases are as follows:
HTML
Accesses the HTML elements that were originally included in the page (should be case insensitive).
return document.getElementById("test").childNodes;
});
HTML createElement
Creates new DOM elements using the same document as the page in which it was shipped (should be case insensitive).
return [
document.createElement("div"),
document.createElement("DIV"),
document.createElement("section"),
document.createElement("SECTION")
];
});
innerHTML
Attempts to inject the elements using .innerHTML (should be case insensitive).
var test = document.getElementById("test");
test.innerHTML = "<div></div><DIV></DIV>" +
"<section></section><SECTION></SECTION>";
return test.childNodes;
});
For the remaining tests I grab a simple XML document:
<test>
<div></div><DIV></DIV>
<section></section><SECTION></SECTION>
</test>
like so:
var xhr = window.XMLHttpRequest ?
	new XMLHttpRequest() :
	new ActiveXObject("Microsoft.XMLHTTP");
xhr.open("GET", "test.xml", false);
xhr.send(null);
var xml = xhr.responseXML;
XML
Test the elements in the XML document directly (should be case sensitive).
return xml.documentElement.childNodes;
});
XML createElement
Same as the HTML createElement but done using the XML document (should be case sensitive).
return [
xml.createElement("div"),
xml.createElement("DIV"),
xml.createElement("section"),
xml.createElement("SECTION")
];
});
HTML via importNode
This clones the nodes from the XML document, using importNode, and places them into the HTML document (should be case sensitive).
var test = document.getElementById("test");
while ( test.firstChild ) {
test.removeChild( test.firstChild );
}
var nodes = xml.documentElement.childNodes, node;
for ( var i = 0; i < nodes.length; i++ ) {
node = document.importNode( nodes[i], false );
test.appendChild( node );
}
return test.childNodes;
});
HTML via adoptNode
This moves the nodes from the XML document, using adoptNode, and places them into the HTML document (should be case sensitive).
var test = document.getElementById("test");
while ( test.firstChild ) {
test.removeChild( test.firstChild );
}
var nodes = xml.documentElement.childNodes, node;
while ( nodes.length ) {
node = document.adoptNode( nodes[0] );
test.appendChild( node );
}
return test.childNodes;
});
The Results
I ran these tests in IE 6, IE 7, IE 8, Firefox 3.5, Safari 4.0.3, Chrome 3.0.195, and Opera 10.10. I also tested .tagName in addition to .nodeName and found no discernible difference (you can run your own .tagName tests by appending ?tagName to any test URL).
Note: The HTML 5, XHTML (served as HTML), and no-doctype pages all behaved identically to each other in every browser, so I'm only showing one set of results for them. The XHTML (as HTML) and no-doctype results would add nothing new.
Firefox, Safari, and Chrome all yielded the same results here: Bringing in elements from an external document maintains the case sensitive nature of the .nodeName property - which is unexpected.
| | <div> | <DIV> | <section> | <SECTION> |
|---|---|---|---|---|
| HTML | DIV | DIV | SECTION | SECTION |
| HTML createElement | DIV | DIV | SECTION | SECTION |
| innerHTML | DIV | DIV | SECTION | SECTION |
| XML | div | DIV | section | SECTION |
| XML createElement | div | DIV | section | SECTION |
| HTML via importNode | div | DIV | section | SECTION |
| HTML via adoptNode | div | DIV | section | SECTION |
Internet Explorer fails in a different manner. To start, Internet Explorer doesn't support importNode or adoptNode so those particular tests simply don't run. However we can confirm that the case sensitivity of the unknown HTML 5 element is maintained in HTML, even though it shouldn't be.
| | <div> | <DIV> | <section> | <SECTION> |
|---|---|---|---|---|
| HTML | DIV | DIV | section | SECTION |
| HTML createElement | DIV | DIV | section | SECTION |
| innerHTML | DIV | DIV | section | SECTION |
| XML | div | DIV | section | SECTION |
| XML createElement | div | DIV | section | SECTION |
| HTML via importNode | Error: Object doesn't support this property or method | | | |
| HTML via adoptNode | Error: Object doesn't support this property or method | | | |
Opera ups the ante one further: Since it attempts to simultaneously follow web standards and implement Internet Explorer's weird quirks, it fails both the importNode/adoptNode and the HTML 5 unknown element cases.
| | <div> | <DIV> | <section> | <SECTION> |
|---|---|---|---|---|
| HTML | DIV | DIV | section | SECTION |
| HTML createElement | DIV | DIV | section | SECTION |
| innerHTML | DIV | DIV | section | SECTION |
| XML | div | DIV | section | SECTION |
| XML createElement | div | DIV | section | SECTION |
| HTML via importNode | div | DIV | section | SECTION |
| HTML via adoptNode | div | DIV | section | SECTION |
XHTML (served with correct mimetype)
Nearly every browser that supported showing this page (Firefox, Safari, Opera, Chrome) displayed the same, expected, results:
| | <div> | <DIV> | <section> | <SECTION> |
|---|---|---|---|---|
| HTML | div | DIV | section | SECTION |
| HTML createElement | div | DIV | section | SECTION |
| innerHTML | div | DIV | section | SECTION |
| XML | div | DIV | section | SECTION |
| XML createElement | div | DIV | section | SECTION |
| HTML via importNode | div | DIV | section | SECTION |
| HTML via adoptNode | div | DIV | section | SECTION |
An XHTML page served properly is just an XML document - thus the case of elements is sensitive (as is to be expected).
... except in Opera. Opera apparently will treat div elements case insensitively, when injected using .innerHTML, even if it's being served within an XHTML document.
| | <div> | <DIV> | <section> | <SECTION> |
|---|---|---|---|---|
| HTML | div | DIV | section | SECTION |
| HTML createElement | div | DIV | section | SECTION |
| innerHTML | DIV | DIV | section | SECTION |
| XML | div | DIV | section | SECTION |
| XML createElement | div | DIV | section | SECTION |
| HTML via importNode | div | DIV | section | SECTION |
| HTML via adoptNode | div | DIV | section | SECTION |
Conclusion
What can we learn from all of this? Unfortunately it appears as if we can't really trust our "trusted" rules about .nodeName case sensitivity for HTML documents. XML documents are completely safe and work as expected. XHTML (served with the correct mimetype) documents are nearly safe, save for the one bizarre Opera bug.
How will this change the code that we write? In short we can no longer trust the case insensitive nature of HTML documents - we need to assume that BOTH HTML and XML documents will be serving their content in a case sensitive nature - especially as more people start to adopt HTML 5 elements in their pages and expect some level of support in older browsers. This means that a number of selectors and DOM methods will take a performance hit as we can no longer take a case insensitive shortcut in our codebases.
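A defensive comparison along these lines might look like the sketch below. The helper name `isTag` is hypothetical, and plain objects again stand in for DOM nodes, using the values from the result tables above. This is one possible approach under those assumptions, not the change that went into any particular library.

```javascript
// Hypothetical helper: normalizes both sides before comparing, so it no
// longer matters whether the browser reported "section" or "SECTION".
function isTag(node, name) {
  return typeof node.nodeName === "string" &&
    node.nodeName.toUpperCase() === name.toUpperCase();
}

// Stand-ins for the cases from the result tables (assumed for illustration).
var ieUnknownElement = { nodeName: "section" }; // unknown element in IE, HTML document
var importedXmlNode = { nodeName: "div" };      // XML node brought in via importNode
var regularHtmlNode = { nodeName: "DIV" };      // ordinary HTML element

console.log(isTag(ieUnknownElement, "section")); // true
console.log(isTag(importedXmlNode, "DIV"));      // true
console.log(isTag(regularHtmlNode, "section"));  // false
```

The cost is two `toUpperCase()` calls per check, which is the performance hit mentioned above. Note also that this treats `<div>` and `<DIV>` as the same element, which is wrong for a genuine XML document where the two are distinct.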
There are a few outstanding jQuery tickets that are the result of these issues cropping up. Now that I know the reasoning behind why they're happening I can strip out all the case-insensitive performance improvements from the codebase - which is really quite unfortunate, but at least it'll behave more consistently. I continue to stand by the thesis from my earlier talk about the DOM: The DOM is a mess and every DOM method and property is broken in some way, in some browser.





Diego Perini (November 24, 2009 at 7:21 pm)
John here are two links that maybe will help in the process:
http://rakaz.nl/item/css_selector_bugs_case_sensitivity
the second one is the current HTML5 work:
http://www.whatwg.org/specs/web-apps/current-work/#selectors
you can find the table of the attributes that need special handling.
You don't have the whole picture of why this is needed. In addition to what you said, SVG, for example, needs the namespaced API to work in HTML documents. So createElementNS() is needed for SVG to work in an HTML document.
You can look in my repo for other needed references on this quite intricate matter.
Hope these additions will be as useful for your project as they were in mine; the overhead is not as much as you said. More of a lazy thing.
Mike Taylor (November 24, 2009 at 8:10 pm)
So does this mean if I type in shouty caps Sizzle will be happy?
Mike Taylor (November 24, 2009 at 8:34 pm)
Re: my last comment (should have tested before I commented), it appears that Sizzle does in fact work with uppercase unknown tags. Here's a test page: http://miketaylr.com/test/html5pseudo_CAPS.html
Mook (November 24, 2009 at 9:00 pm)
But you're importing nodes from the null namespace, aren't you? What happens if your imported elements are in the XHTML namespace? What happens if they're SVG elements instead? If you called createElementNS instead of createElement?
I have no idea what the _right_ answers are, just what things might bring different answers. I was also under the impression that HTML5 has.. _something_ to do with namespaces, but I can't tell what section 9.3 is (it's missing). That might affect things as well?
Nate Cavanaugh (November 25, 2009 at 12:50 am)
Heya John, just out of curiosity what kind of performance difference is there? Is it a significant impact or is it under loads on edge cases?
And is it a change that memoization would help or are the property lookups more expensive than the toUpper/toLower methods are? Namely would caching based on the most common result (uppercase), while still allowing for safety for the edge cases help at all or is it not practically helpful in real performance tests?
Antonello Pasella (November 25, 2009 at 3:31 am)
Why not insert a new property in jQuery.support or directly in Sizzle?
$.support.nodeNameUppercase = (document.createElement('fOo').nodeName == 'fOo');
Pete B (November 25, 2009 at 4:30 am)
For some reason I never trusted the nodeName property to always be the same case in all situations.
Neil Rashbrook (November 25, 2009 at 6:21 am)
If you create an XHTML div element in a XUL document then it is case sensitive, but if you import that node into an HTML document then it becomes upper case, the same as if you created it in the HTML document.
Antonello Pasella (November 25, 2009 at 7:06 am)
Correction
$.support.nodeNameUppercase = (document.createElement('fOo').nodeName === 'FOO');
zcorpan (November 25, 2009 at 9:18 am)
The spec says
"Element.tagName and Node.nodeName
These attributes must return element names converted to ASCII uppercase, regardless of the case with which they were created."
This means that for a <div> created in an XML document and moved to an HTML document, nodeName should return "DIV".
Also note "This does not apply to ... elements that are not in the HTML namespace despite being in HTML documents.", which means that and in HTML will have lowercase nodeName.