Guest Post: PHP’s Remarkable Hexadecimals

August 23, 2013


Guest Post by Sharon Lee Levy, ZCE

Intro

During the past year, I have been giving a presentation whose title varies between “PHP: Numerics and Wizardry” and “PHP: Quirks, Gotchas & Wizardry”.

Both talks discuss, among other topics, hexadecimal support in PHP.  While it is true that the use cases for hexadecimals are less frequent now than in the past, if you were to encounter a hexadecimal on your next visit to PHPland, it’s good to have a realistic idea of what to expect from this numerical representation, whose complexity is less than apparent at first glance.  But how complex can this subject really be when it deals essentially with base 16?  Let’s find out!

One truth about hexadecimals tells a visitor to the land where blue, green and other colorful elePHPants play that they have entered a different world.  Hexadecimal whole numbers are fine as long as they do not exceed 2^53, the point up to which every integer can be stored exactly as a floating point value.  But hexadecimal fractions are unsupported, despite being acceptable entities in the realm of math.  Apparently, the law of supply and demand is at work as PHP continues to evolve.  Since few users have need for such fractions in PHP, the language economizes by striving to adhere to what is really useful for web applications.  And, to make matters even more interesting, there’s an added twist.  A hexadecimal in PHP may itself represent a decimal integer or a floating point value!

So 0.2 in hexadecimal has no representation in PHP as a hexadecimal fraction.  But you may express its decimal value in PHP as follows:

<?php

// 0.2 in hexadecimal is 2 * 16^-1, so this prints 0.125
echo 2 * pow(16, -1);

And, 0xFFFFFFFF may refer to a negative integer or a positive floating point value in PHP.
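The hexadecimal-fraction calculation above generalizes to any number of digits. Here is a minimal sketch, assuming PHP 7 or later; the helper name hexFractionToFloat is hypothetical and not part of PHP:

```php
<?php
// Hypothetical helper (not a PHP built-in): converts a hexadecimal string with
// an optional fractional part, such as "0.2" or "A.C", to a float.
function hexFractionToFloat(string $hex): float
{
    // Split into the digits before and after the radix point.
    $parts = explode('.', $hex, 2);
    $whole = $parts[0];
    $frac  = $parts[1] ?? '';

    // hexdec() converts the whole part.
    $value = $whole === '' ? 0.0 : (float) hexdec($whole);

    // Each fractional digit is weighted by a negative power of 16.
    for ($i = 0, $n = strlen($frac); $i < $n; $i++) {
        $value += hexdec($frac[$i]) * pow(16, -($i + 1));
    }

    return $value;
}

echo hexFractionToFloat('0.2'), "\n"; // 0.125
echo hexFractionToFloat('A.C'), "\n"; // 10.75
```

The sketch does no input validation; it relies on hexdec to interpret each digit.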

Only in PHPland

PHP provides two functions, dechex and hexdec, to make it easy for users to convert a number back and forth between decimal and hexadecimal.  The online manual makes the following baffling statement about the first function:

“As PHP’s integer type is signed, but dechex() deals with unsigned integers, negative integers will be treated as though they were unsigned.” (see http://us1.php.net/manual/en/function.dechex.php).  It appears that dechex must pull off an unimaginable feat and work with a non-existent data type in PHP – clearly a ludicrous predicament.

Although PHP lacks unsigned numbers, the underlying foundation of PHP is predominantly the C programming language, which possesses a variety of integer data types, including unsigned, long and short integers.  PHP’s integer to date has been a signed long integer, although there has been talk on the PHP Internals list of late about bringing a 64-bit int to PHP.  The current situation could benefit from a 64-bit int because the number of bits comprising a long integer differs according to platform, which makes the range of a PHP integer platform-dependent.

The range of signed long integers on a 64-bit platform is -9223372036854775808 to 9223372036854775807; otherwise the range is dramatically smaller, -2147483648 to 2147483647.  As an aside, note that these ranges are asymmetrical with respect to zero.  Instead of sitting in the middle of each range, zero heads the range of non-negative signed integers, while negative one starts the negative range of values.
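To see which of these ranges applies to a given installation, the built-in constants PHP_INT_SIZE and PHP_INT_MAX can be inspected. A minimal sketch; the variable names are illustrative only:

```php
<?php
// PHP_INT_SIZE is the integer width in bytes: 4 or 8, depending on the build.
$bits = PHP_INT_SIZE * 8;

// Flipping every bit of the largest integer yields the smallest one.
$min = ~PHP_INT_MAX;
$max = PHP_INT_MAX;

printf("%d-bit integers: %d to %d\n", $bits, $min, $max);

// The range is asymmetrical: there is one more negative value than positive.
var_dump($min === -$max - 1); // bool(true)

// Stepping past the maximum does not wrap around; PHP switches to a float.
var_dump(is_float($max + 1)); // bool(true)
```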

The difference between the two ranges is due to the size of a long, which is 32 bits unless it resides on a 64-bit platform, where its width is typically doubled.  But what does this have to do with dechex, you may well wonder?  In order for dechex to return a meaningful hexadecimal string, it needs to use positive values appropriate for either a 64-bit or a 32-bit platform.  These are the ranges of values that the function requires:

32-bit integer:  0 to 4294967295

64-bit integer:  0 to 18446744073709551615

Looking at these positive values, one might mistakenly conclude that PHP supports unsigned integers.  Nothing could be further from the truth.  PHP does not support unsigned integers.  PHP deftly finesses the impossible by having the positive signed integers form the bottom half of each range.  What’s left over?  The negative numbers, which will be transformed into positive floating point values.  How does this happen, and is this some kind of magic?  I prefer to label it wizardry.

Appreciating the seemingly magical transformation of the negative numbers into positive doubles requires examining the source code for dechex.  If you review the latest code in PHP 5.5 (see http://lxr.php.net/xref/PHP_5_5/ext/standard/math.c#1047), you will note that after variable declarations, parameter parsing and promoting any non-integer argument to a long, _php_math_longtobase() executes.  This function takes two parameters, namely the numeric value to convert and a base, which appropriately enough is 16 in this case.  Let’s inspect this function (see http://lxr.php.net/xref/PHP_5_5/ext/standard/math.c#_php_math_longtobase):

PHPAPI char * _php_math_longtobase(zval *arg, int base)
{
    static char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    char buf[(sizeof(unsigned long) << 3) + 1];
    char *ptr, *end;
    unsigned long value;
    [snipped …]
    do {
        *--ptr = digits[value % base];
        value /= base;
    } while (ptr > buf && value);
    return [snipped ..]
}
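For readers more at home in PHP than in C, the digit loop in the C function above can be sketched in userland. This is an illustration only, assuming PHP 7 or later; the name longToBase is hypothetical, and the sketch accepts only non-negative values because PHP has no unsigned type to fall back on:

```php
<?php
// Hypothetical userland port of the digit loop; not PHP's actual implementation.
function longToBase(int $value, int $base): string
{
    if ($value < 0 || $base < 2 || $base > 36) {
        throw new InvalidArgumentException('Expected value >= 0 and 2 <= base <= 36');
    }

    $digits = '0123456789abcdefghijklmnopqrstuvwxyz';
    $result = '';

    // Peel off the least significant digit and prepend it, much as the C loop
    // does when it walks backward through its buffer.
    do {
        $result = $digits[$value % $base] . $result;
        $value  = intdiv($value, $base);
    } while ($value > 0);

    return $result;
}

echo longToBase(255, 16), "\n";  // ff
echo longToBase(2013, 16), "\n"; // 7dd
```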


The C code above comes to PHP’s aid by providing precisely what PHP lacks: unsigned integer support.  Note the declaration of the variable value, which is an unsigned long.  So, if you were to pass -1 to dechex, that negative value will find its way internally to the C function _php_math_longtobase, which assigns it to the variable value.  Since that variable is unsigned, the result will be a positive number, either 4294967295 or 18446744073709551615, depending on your platform’s unsigned long integer size.
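The unsigned reading of -1 can be observed from PHP itself. Here is a small sketch using dechex and the %u conversion of sprintf, which presents an integer as an unsigned decimal number; the results depend on the integer width of the build:

```php
<?php
// Number of hex digits in a native integer: 8 on 32-bit builds, 16 on 64-bit.
$hexDigits = PHP_INT_SIZE * 2;

// dechex() reads -1 as the largest unsigned value, so every digit is "f".
$hex = dechex(-1);
var_dump($hex === str_repeat('f', $hexDigits)); // bool(true)

// The %u conversion shows the same reinterpretation in decimal.
$unsigned = sprintf('%u', -1);
echo $unsigned, "\n"; // 4294967295 or 18446744073709551615
```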

Complementary Phun

Most likely, the outcome results from the application of the two’s complement method.  The basic idea behind this method is that negative and positive numbers are stored in binary on modern computer systems in a complementary fashion.  The number one in binary on a 32-bit system is:

00000000000000000000000000000001

A negative one needs to have a distinct binary bit pattern in order to distinguish it from positive one.  According to the two’s complement method, two things need to happen to store a negative integer: invert all the bits, then add one to the result, as follows:

Step One:  Bit Inversion

Bit inversion can be easily achieved using the bitwise NOT (~) operator.

~00000000000000000000000000000001
_________________________________

 11111111111111111111111111111110

Step Two: Add One

  11111111111111111111111111111110
+                                1
  ________________________________
  11111111111111111111111111111111
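The two steps can be replayed in PHP with the ~ operator and decbin, which, like dechex, shows a negative integer as its unsigned bit pattern. A minimal sketch; the $show closure is a hypothetical helper, and the output is 32 or 64 digits wide depending on the build:

```php
<?php
$bits = PHP_INT_SIZE * 8;

// Hypothetical helper: pad to the full integer width so leading zeros show.
$show = function (int $n) use ($bits): string {
    return str_pad(decbin($n), $bits, '0', STR_PAD_LEFT);
};

$one      = 1;
$inverted = ~$one;         // step one: flip every bit
$negative = $inverted + 1; // step two: add one

echo $show($one), "\n";
echo $show($inverted), "\n";
echo $show($negative), "\n";

var_dump($negative === -1); // bool(true)
```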

Of course, this is not something that you as a PHP developer would ordinarily need to do.  The two’s complement method generally operates behind the scenes, since it pertains to the way a compiler encodes a negative number in a binary format.

What’s great about the two’s complement method is that the binary forms of positive one and negative one are distinct from each other, as follows:

00000000000000000000000000000001 (one)

11111111111111111111111111111111 (negative one)

But note that the bit pattern for negative one is ambiguous: the same 32 bits also represent the integer value 4294967295.  How the pattern evaluates depends on whether the leftmost bit does double duty by also serving as the sign of the number.  An unsigned integer in C does not have a sign bit, so when -1 is assigned to an unsigned 32-bit integer the resulting bit pattern can only equal 4294967295.  That is the value dechex uses in constructing its translation of -1 into the hex string ‘ffffffff’.  (Each group of four bits, or nibble, in a binary string represents one hex digit.)

Because the sign is lost along the way, when you pass the hex string to the companion function, hexdec, the function fails to respond with -1, the original value that was passed to dechex, and instead returns 4294967295.  This number is, of course, too large to store as an integer in PHP on a 32-bit system, so the C code converts it to a double before handing it over to PHP.
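The failed round trip is easy to reproduce. The sketch below makes no assumption about integer width: on a 64-bit build dechex(-1) yields sixteen f digits rather than eight, but the value that comes back from hexdec is still too large for a signed integer, so it arrives as a float either way:

```php
<?php
$hex  = dechex(-1);   // "ffffffff" on 32-bit builds, sixteen f's on 64-bit
$back = hexdec($hex); // too large for a signed integer, so a float comes back

var_dump($back == -1);     // bool(false): the sign is gone
var_dump(is_float($back)); // bool(true)

// The 32-bit pattern discussed above: an int on 64-bit builds,
// a float on 32-bit builds, but 4294967295 in both cases.
var_dump(hexdec('ffffffff') == 4294967295); // bool(true)
```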

Tying Up Loose Ends

All this discussion about the two’s complement method may leave you wondering what is so complementary about it.  The resulting binary bit pattern, which may express a positive or a negative number, yields an interesting sort of equivalence between the two numbers.  If you add the absolute values of the negative and positive values, their sum will equal 2^32, a fact which makes them complementary to each other.  You can say, for example, that -1 is the two’s complement of 4294967295.  The mathematical way to express this equivalence, or congruence, is to say that -1 is congruent to 4294967295 modulo 2^32.  You can prove this by subtracting either number from the other and then dividing the result by 2^32, i.e. 4294967296.  The remainder will be zero.

I intend to speak more about hexadecimals and other numerical topics with respect to various quirks and gotchas.
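The congruence can be checked with a few lines of arithmetic. This is a sketch, assuming PHP 7 or later; it uses floats so that it behaves the same on 32-bit and 64-bit builds (every value involved is far below 2^53, so the float arithmetic is exact), and the helper toSigned32 is hypothetical:

```php
<?php
$modulus  = 4294967296.0; // 2^32
$unsigned = 4294967295.0;
$signed   = -1.0;

// The absolute values sum to 2^32.
var_dump(abs($signed) + $unsigned === $modulus); // bool(true)

// The difference is a multiple of 2^32, so the remainder is zero.
var_dump(fmod($unsigned - $signed, $modulus) === 0.0); // bool(true)

// Hypothetical helper: reads a 32-bit unsigned value as its signed
// counterpart by subtracting 2^32 from anything at or above 2^31.
function toSigned32($n): int
{
    return (int) ($n >= 2147483648 ? $n - 4294967296 : $n);
}

var_dump(toSigned32(hexdec('ffffffff'))); // int(-1)
```

The helper offers one way to recover the original -1 from the output of hexdec when the value is known to be a 32-bit pattern.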

In the meantime, happy PHPing!


Sharon is a Zend Certified Engineer providing professional web development services for companies in diverse industries, from startups to the Fortune 500 on the UNIX, Linux or Windows platforms.

Her public speaking engagements include presentations at code camps and PHP meetups,  2009 CodeWorks Los Angeles,  the ZendCon unCon, SCALE (WIOS), and Zend Technologies webinars.

php|architect has published several of her articles on subjects ranging from email verification to web-based retrieval, to PHP’s support for closures.