Lists all of the journal entries for the day.

Thu, 30 Sep 2010

2:23 PM - PHP is broken

http://www.phpwact.org/php/i18n/charsets

The latest in Zend bugs: strlen reports 27 characters for a string that is 10 characters long.
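A likely explanation, assuming the string was UTF-8 encoded: PHP's strlen counts bytes, not characters, so any multibyte text reports a larger number. The sketch below shows the same gap in Python. The sample string is hypothetical, chosen only to reproduce the 10 versus 27 figures; it is not the string from the original bug.

```python
# Hypothetical sample: 7 CJK characters (3 bytes each in UTF-8)
# followed by 3 Cyrillic characters (2 bytes each in UTF-8).
sample = "日本語テキストтри"

char_count = len(sample)                  # counts Unicode code points
byte_count = len(sample.encode("utf-8"))  # counts encoded bytes, like PHP's strlen

print(char_count, byte_count)
```

In PHP itself, mb_strlen with an explicit 'UTF-8' encoding argument is the usual way to get the character count.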


2:29 PM - How to test for UTF-8 characters

One of the problems on the web is the variety of character encodings. Computers represent text in different ways: some encodings handle multiple languages, others do not. One encoding that does is UTF-8. You can test whether input is valid UTF-8 in your web applications using this regular expression:

http://www.w3.org/International/questions/qa-forms-utf-8

 

$field =~
  m/\A(
     [\x09\x0A\x0D\x20-\x7E]            # ASCII
   | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
   |  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
   | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
   |  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
   |  \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3
   | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
   |  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
  )*\z/x;
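For readers not working in Perl, here is a minimal sketch of the same check ported to Python, operating on raw bytes. The names UTF8_PATTERN and is_valid_utf8 are made up for this illustration; the byte ranges are the ones in the pattern above.

```python
import re

# Byte-level port of the pattern above. re.VERBOSE plays the role of Perl's /x,
# and \Z in Python anchors at the very end of the input, like Perl's \z.
UTF8_PATTERN = re.compile(
    rb"""\A(?:
      [\x09\x0A\x0D\x20-\x7E]            # ASCII (tab, LF, CR, printable)
    | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
    | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
    | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
    | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
    | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
    | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
    | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
    )*\Z""",
    re.VERBOSE,
)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if every byte sequence in data matches the pattern."""
    return UTF8_PATTERN.match(data) is not None


print(is_valid_utf8("gr\u00fc\u00dfe".encode("utf-8")))  # well-formed UTF-8
print(is_valid_utf8(b"\xc0\xaf"))                        # overlong 2-byte form
print(is_valid_utf8(b"\xed\xa0\x80"))                    # encoded surrogate
```

Note that the ASCII branch only admits tab, line feed, carriage return, and printable characters, so input containing other control bytes such as NUL is rejected even though it is technically valid UTF-8. When that strictness is not needed, calling data.decode("utf-8") and catching UnicodeDecodeError is a simpler test.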

tags: encoding character
