Surprise!

Submitted by michael on Wed, 12/14/2011 - 05:46
Pop quiz time! How do these three bits of HTML get parsed?
  1. <foo bar=baz/> (no space between the value and the slash)
  2. <foo bar=baz /> (space after the unquoted value)
  3. <foo bar="baz"/> (quoted attribute value)
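In the first example, the value of the bar attribute is literally “baz/”, slash included, and no errors or warnings are generated. An unquoted attribute value runs until the next whitespace or “>”, so the slash is swallowed into the value. The second and third examples are parsed as “baz”, as most people would probably expect.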
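For anyone who wants to check the three cases themselves, here is a minimal sketch using Python's standard-library html.parser. That is an assumption on my part: it is not necessarily the parser that produced the results above, but it treats unquoted attribute values the same way for these inputs. The AttrCollector class and the bar_value helper are names made up for this illustration.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: records the attributes of every start tag seen."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; quotes are already stripped.
        # Self-closing tags also end up here via the default handle_startendtag.
        self.seen.append((tag, dict(attrs)))


def bar_value(markup):
    """Parse a one-tag snippet and return the value of its bar attribute."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen[0][1]["bar"]


if __name__ == "__main__":
    for snippet in ('<foo bar=baz/>', '<foo bar=baz />', '<foo bar="baz"/>'):
        print(snippet, "->", repr(bar_value(snippet)))
```

Only the first snippet should come back with the trailing slash attached to the value. Quoting attribute values, as in the third snippet, avoids the ambiguity altogether.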

TILs

Submitted by michael on Fri, 12/02/2011 - 03:43
  1. Each Perl module file needs to declare its own pragmas at the top of the file for them to have any effect. Pragmas like strict and warnings are lexically scoped, so they do not carry over from the script that loads the module. This is irritating. All this time I thought I was coding in modern Perl, with strict and warnings enabled. Adding those things back has been a PITA. At least this time I was smart enough to build a test suite.
  2. Test::More doesn't handle UTF-8 well at all. Because of how it duplicates the standard file handles, you can't easily change STDOUT etc. to handle UTF-8.

xquery madness

Submitted by michael on Wed, 11/02/2011 - 19:31
Maybe I don't understand XQuery well enough yet. I've only been working with it for a little while, and it's a very strange, arcane language. But I don't understand why these two functions return different results. The text between the smiley parentheses, (: and :), should be a comment, ignored by the parser and not output. But it isn't ignored.