Best undefined-behavior questions in February 2011

Undefined behaviour in (X)HTML?

20 votes

I know this question is pretty much asking for downvotes, but...

Is there such a thing as undefined behaviour in (X)HTML?

I started wondering about this after playing around with the <button> tag, which allows HTML content to be rendered as a button. Nothing new so far...

But I noticed that one can also use the <a> tag. Complete example:

<button>
    normal text
    <b>bold text</b>
    <a href="http://www.example.com/">linked text</a>
</button>

This is rendered as follows in Firefox:

[screenshot]

And in Google Chrome:

[screenshot]

Now, in Firefox, the link is NOT clickable, only the button is. In Chrome, however, the link is clickable and redirects to the IANA RFC 2606 page.

Is this undefined behaviour? Are there more cases in (X)HTML that could be described as undefined behaviour?

EDIT:
I welcome more opinions! :-)

It's a little more complex than just inspecting the DTD, as suggested in the answers by Yi Jiang and mu is too short.

It's true that the XHTML 1.0 DTDs explicitly forbid <a> elements as children of <button> elements, as in your question. However, they do not forbid <a> elements as deeper descendants of <button> elements.

So

<button>
    normal text
    <b>bold text</b>
    <span><a href="http://www.example.com/">linked text</a></span>
</button>

is XHTML 1.0 Strict DTD conforming. But it has the same behavioural difference between Firefox and Chrome as the button fragment in the question.

Now, it is known that DTDs have problems describing limitations on descendant relationships, so it's maybe not surprising that the above sample is DTD conforming.

However, Appendix B of the XHTML 1.0 spec normatively describes descendant limitations in addition to the DTD. It says:

The following elements have prohibitions on which elements they can contain (see SGML Exclusions). This prohibition applies to all depths of nesting, i.e. it contains all the descendant elements.

button
must not contain the input, select, textarea, label, button, form, fieldset, iframe or isindex elements.

Note that it does not contain an exclusion for the <a> element. So it seems that XHTML 1.0 does not prohibit the <a> element from being a non-child descendant of <button>, and the behaviour in this case is indeed undefined.

This omission is almost certainly a mistake. The <a> element should have been in the list of elements prohibited as descendants of button in Appendix B.

HTML5 (including XHTML5) is much more thorough on the matter. It says:

4.10.8 The button element

Content model: Phrasing content, but there must be no interactive content descendant.

where interactive content is defined as

Interactive content is content that is specifically intended for user interaction.

  • a
  • audio (if the controls attribute is present)
  • button
  • details
  • embed
  • iframe
  • img (if the usemap attribute is present)
  • input (if the type attribute is not in the Hidden state)
  • keygen
  • label
  • menu (if the type attribute is in the toolbar state)
  • object (if the usemap attribute is present)
  • select
  • textarea
  • video (if the controls attribute is present)

So in (X)HTML5 the <a> element is prohibited from being a descendant of the <button> element.

sizeof(""+0) != sizeof(char *) Bug or undefined behaviour?

13 votes

The following C program:

#include <stdio.h>

int main(void)
{
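    /* sizeof yields size_t, so cast each value to match the %u conversion */
    printf("%u %u %u\n", (unsigned)sizeof "", (unsigned)sizeof(""+0), (unsigned)sizeof(char *));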
    return 0;
}

outputs 1 4 4 when compiled with GCC on Linux, but outputs 1 1 4 when compiled with Microsoft Visual C++ on Windows. The GCC result is what I would expect. Do they differ because MSVC has a bug or because sizeof(""+0) is undefined? For both compilers the behaviour (i.e. whether the middle value printed is equal to the first value or the last value) is the same no matter what string literal or integer constant you use.

A relevant reference in the ANSI C Standard seems to be 6.2.2.1 - Lvalues and function designators:

"Except when it is the operand of the sizeof operator ... an lvalue that has type 'array of type' is converted to an expression that has type 'pointer to type' that points to the initial element of the array object and is not an lvalue".

Here, though, the "Except" should not apply, because in sizeof(""+0) the array/string literal is an operand of +, not of sizeof.

Because "fooabc" is of type char[7], sizeof("fooabc") yields the same as sizeof(char[7]). However, arrays can be implicitly converted - the part you quoted - to pointers (some people wrongly call this "decay"), and since this is necessary for the arithmetic (+) to work, ""+0 will have a type of char*. And a char pointer can have a different size than the array. In that regard, MSVC's behavior seems broken.
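As a minimal sketch of the point above, the three expressions from the question can be isolated into small functions. The function names are hypothetical and exist only for illustration; nothing here is specific to GCC or MSVC.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: the literal is the direct operand of sizeof,
   so it keeps its array type char[1] (just the terminating NUL). */
size_t size_of_literal(void)
{
    return sizeof "";
}

/* Hypothetical helper: the literal is an operand of +, so it is
   converted to char * before sizeof is applied to the expression. */
size_t size_of_literal_plus_zero(void)
{
    return sizeof("" + 0);
}

/* Hypothetical helper: the size the middle expression should match. */
size_t size_of_char_pointer(void)
{
    return sizeof(char *);
}

/* Prints the three sizes; the casts keep the arguments matched to %lu. */
void print_sizes(void)
{
    printf("%lu %lu %lu\n",
           (unsigned long)size_of_literal(),
           (unsigned long)size_of_literal_plus_zero(),
           (unsigned long)size_of_char_pointer());
}
```

Calling print_sizes() from main on a conforming compiler should print 1 followed by two equal numbers. The GCC output of 1 4 4 from the question fits that pattern, while 1 1 4 does not.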