DrUUID RFC 4122 library manual

Preface

"DrUUID" is a PHP 5 implementation of, and associated API for, the data format described in RFC 4122, A Universally Unique IDentifier (UUID) URN Namespace. It is able to mint new UUIDs, import existing UUIDs, extract information from UUIDs, and compare two UUIDs for bit-exact equality.

The API is designed to be as simple as possible, with an implementation as accurate as practical given the limits of PHP. All other concerns, including PHP compatibility, efficiency and extensibility, are secondary.

Questions and comments are very welcome, and should be directed to the author via his Web site.

Table of contents

Prefaces

  1. Preface
  2. Table of contents
  3. Features
  4. Requirements
  5. Conformance exceptions

Body

  1. Documentation note
  2. The API
    1. UUID::mint()
      1. Version 1
      2. Version 3
      3. Version 4
      4. Version 5
    2. UUID::import()
    3. UUID::compare()
    4. UUID::mintStr()
  3. The UUID object
    1. Public properties
  4. Runtime configuration
    1. UUID::initRandom()
    2. UUID::initBignum()
    3. UUID::initStorage()
    4. UUID::initAccurate()
  5. Using custom stable storage implementations
    1. UUID::registerStorage()
    2. The UUIDStorage interface

Appendices

  1. Defined constants
  2. Exception codes
  3. Flow chart for storage interface
  4. Credits and licensing
  5. Revision history

Features

Requirements

Behaviour may be erratic with PHP versions earlier than 5.1 due to bugs in string casting for objects.

Conformance exceptions

The last three points only apply if DrUUID is used in its default configuration; the first point is not a conformance violation, but is sub-optimal. See Section 4 for further details on configuring for optimal accuracy.

Documentation note

This manual often makes reference to binary and hexadecimal strings for input and output. For the sake of simplicity please assume that such strings are always in network order (big-endian).

For methods accepting a UUID as an argument, the UUID may be:

This manual often makes reference to invalid UUIDs. For simplicity this is merely any string more or less than 16 bytes long. DrUUID performs no other validation on UUIDs.

The API

The core DrUUID API consists of four static methods: UUID::mint(), UUID::import(), UUID::compare() and UUID::mintStr(). Of these, UUID::mint() and UUID::import() return an instance of the UUID class.

UUID::mint()

UUID UUID::mint( [int version [, ... ]] )

The UUID::mint() method generates ("mints", like coinage) a new UUID. It is capable of producing Version 1 (time-based), Version 3 (MD5 hash-based), Version 4 (random) and Version 5 (SHA-1 hash-based) UUIDs. Its argument list is generic: required and optional arguments depend upon the version to be produced.

Version 1

UUID UUID::mint( void )
UUID UUID::mint( 1 [, string node [, string sequence [, string time ]]] )

Version 1 UUIDs (the default type) are generated based on the current time and a MAC address (called a node).

If specified, node should be either a 6-byte binary string or a 12-character hexadecimal string (with or without separators) representing a MAC address. DrUUID does not attempt to detect the host's MAC address. Invalid nodes will throw an exception.

The sequence argument specifies a clock sequence and should be a two-byte binary string. This should only be used for debugging. Invalid sequences will throw an exception.

Finally, the time argument may be specified to employ a past or future time (as a Unix timestamp with microseconds, like that returned by microtime() for example) instead of the current time. This should only be used for debugging and never used to generate UUIDs for any purpose but testing. Input which cannot be parsed as a timestamp will throw an exception.
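
As a rough illustration of the node formats described above, the following sketch reduces a MAC address to the 6-byte binary form. The function normalizeNodeSketch() is hypothetical and not part of DrUUID; the library's own parsing may well be stricter or more lenient.

```php
<?php
// Hypothetical helper, not part of DrUUID: accepts a 6-byte binary string or
// a 12-digit hexadecimal string (with or without separators) and returns the
// node as 6 raw bytes, or NULL if the input is not recognisable.
function normalizeNodeSketch($node)
{
    if (!is_string($node)) {
        return null;
    }
    if (strlen($node) === 6) {
        return $node; // already binary
    }
    // Permissive by design: drop anything that is not a hexadecimal digit
    $hex = preg_replace('/[^0-9a-f]/i', '', $node);
    if (strlen($hex) !== 12) {
        return null;
    }
    return pack('H*', $hex);
}

$node = normalizeNodeSketch('00-C0-4F-D4-30-C8');
echo bin2hex($node), "\n"; // 00c04fd430c8

// With DrUUID loaded, the same string may be passed to UUID::mint() directly.
if (class_exists('UUID', false)) {
    echo UUID::mint(1, '00-C0-4F-D4-30-C8'), "\n";
}
```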

Version 3

UUID UUID::mint( 3, string name, mixed namespace )

Version 3 UUIDs are generated based upon an MD5 hash of an arbitrary name and its associated name-space. For example the name "www.example.com" is within the DNS namespace, much as "Canada" is within a name-space of the world's countries. A name/namespace pair will predictably generate the same UUID.

The name argument is an arbitrary name and should be in a binary form appropriate for namespace. It is the responsibility of the user to assure the proper conversion to binary form. For many namespaces (like the DNS) the appropriate representation is plain text and therefore no conversion is required.

The namespace argument is itself a UUID; invalid UUIDs will throw an exception.

Note that the continued use of Version 3 UUIDs is discouraged: Version 5 UUIDs should be used instead whenever possible.

Version 4

UUID UUID::mint( 4 )

Version 4 UUIDs are generated from random numbers. Save for embedded version information they are completely random.
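
The "embedded version information" amounts to six fixed bits. The sketch below shows where they sit, following RFC 4122; it is not DrUUID's own code, the name mintV4Sketch() is hypothetical, and mt_rand() is used only because it is always available.

```php
<?php
// Sketch of RFC 4122 Version 4 generation; not DrUUID's own implementation.
function mintV4Sketch()
{
    $bytes = '';
    for ($i = 0; $i < 16; $i++) {
        $bytes .= chr(mt_rand(0, 255));
    }
    // Version: the high nibble of octet 6 is set to 0100 (4)
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    // Variant: the two most significant bits of octet 8 are set to 10
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);

    $hex = bin2hex($bytes);
    return substr($hex, 0, 8) . '-' . substr($hex, 8, 4) . '-'
         . substr($hex, 12, 4) . '-' . substr($hex, 16, 4) . '-'
         . substr($hex, 20, 12);
}

echo mintV4Sketch(), "\n";
```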

Version 5

UUID UUID::mint( 5, string name, mixed namespace )

Version 5 UUIDs are generated based upon a SHA-1 hash of an arbitrary name and its associated name-space. For example the name "www.example.com" is within the DNS namespace, much as "Canada" is within a name-space of the world's countries. A name/namespace pair will predictably generate the same UUID.

The name argument is an arbitrary name and should be in a binary form appropriate for namespace. It is the responsibility of the user to assure the proper conversion to binary form. For many namespaces (like the DNS) the appropriate representation is plain text and therefore no conversion is required.

The namespace argument is itself a UUID; invalid UUIDs will throw an exception.

Version 5 UUIDs are preferred over Version 3 UUIDs.
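
To make the name/namespace relationship concrete, here is a minimal sketch of name-based generation as RFC 4122 describes it, covering both Version 3 and Version 5. The function mintNamedSketch() is hypothetical and takes the namespace in canonical string form only; DrUUID's internals may differ.

```php
<?php
// Sketch of RFC 4122 name-based generation (Versions 3 and 5).
function mintNamedSketch($version, $name, $namespace)
{
    if ($version !== 3 && $version !== 5) {
        throw new InvalidArgumentException('Only Versions 3 and 5 are name-based');
    }
    // Convert the canonical namespace string to its 16 raw bytes
    $nsBytes = pack('H*', str_replace('-', '', $namespace));
    // Hash namespace and name together: MD5 for Version 3, SHA-1 for Version 5
    $algo  = ($version === 3) ? 'md5' : 'sha1';
    $bytes = substr(hash($algo, $nsBytes . $name, true), 0, 16);
    // Overwrite the version nibble and the variant bits
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | ($version << 4));
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);

    $hex = bin2hex($bytes);
    return substr($hex, 0, 8) . '-' . substr($hex, 8, 4) . '-'
         . substr($hex, 12, 4) . '-' . substr($hex, 16, 4) . '-'
         . substr($hex, 20, 12);
}

$dns = '6ba7b810-9dad-11d1-80b4-00c04fd430c8'; // same value as UUID::nsDNS
echo mintNamedSketch(5, 'python.org', $dns), "\n"; // 886313e1-3b8a-5372-9b90-0c9aee199e5d
echo mintNamedSketch(3, 'python.org', $dns), "\n"; // 6fa459ea-ee8a-3ca4-894e-db77e160355e

// With DrUUID loaded, the library call should agree with the sketch.
if (class_exists('UUID', false)) {
    echo UUID::mintStr(5, 'python.org', UUID::nsDNS), "\n";
}
```

The same inputs always produce the same output, which is the property that makes these versions useful for deriving stable identifiers from existing names.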

UUID::import()

UUID UUID::import( string uuid )

The UUID::import() method imports a UUID string as a UUID object. Invalid UUIDs will throw an exception.

UUID::compare()

bool UUID::compare( mixed uuid1, mixed uuid2 )

The UUID::compare() method compares two UUIDs for equivalency. If both UUIDs, as binary numbers, are equal, the method returns TRUE. The method will also return TRUE if neither argument is a valid UUID.

This method is useful for determining if two different UUID representations (eg. canonical string, lowercase hex string, uppercase hex string, binary, URN) are in fact the same UUID.
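
The idea can be sketched as reducing every representation to 16 raw bytes before comparing. Both functions below are hypothetical stand-ins rather than DrUUID code, and the set of accepted formats is an assumption made for the example.

```php
<?php
// Hypothetical normaliser: reduces several textual forms to 16 raw bytes,
// or returns NULL if the input is not recognisable as a UUID.
function uuidBytesSketch($uuid)
{
    $uuid = (string) $uuid;
    if (strlen($uuid) === 16) {
        return $uuid; // already binary
    }
    // Drop an optional URN prefix, then braces and hyphens
    $hex = preg_replace('/^urn:uuid:/i', '', $uuid);
    $hex = str_replace(array('-', '{', '}'), '', $hex);
    if (!preg_match('/^[0-9a-f]{32}$/i', $hex)) {
        return null;
    }
    return pack('H*', $hex);
}

function compareSketch($a, $b)
{
    // Two invalid inputs both normalise to NULL and so compare as equal,
    // mirroring the behaviour described for UUID::compare()
    return uuidBytesSketch($a) === uuidBytesSketch($b);
}

var_dump(compareSketch(
    '550E8400-E29B-41D4-A716-446655440000',
    'urn:uuid:550e8400-e29b-41d4-a716-446655440000'
)); // bool(true)
```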

UUID::mintStr()

string UUID::mintStr( [int version [, ... ]] )

The UUID::mintStr() method performs the same functions as the UUID::mint() method, but returns the UUID directly as a string in canonical form.

The UUID object

UUID objects cannot be instantiated manually; they must be created via UUID::mint() or UUID::import(). When cast to a string a UUID object will be rendered in the canonical string form (eg. 550e8400-e29b-41d4-a716-446655440000). They have no public methods, but do have a number of public properties:

Public properties

bytes
A 16-byte binary string representation of the UUID.
hex
A 32-character hexadecimal representation of the UUID. Neither octets nor fields are ever padded and high digits are always lowercased.
string
The canonical string representation of the UUID, with high hexadecimal digits always lowercased.
urn
The UUID formatted as an URN.
version
The UUID's version (eg. 1, 3, 4, 5).
variant
The UUID's variant. For RFC 4122 UUIDs this is always 1.
node
The MAC address associated with the UUID. Only applicable to Version 1 identifiers.
time
The time at which the UUID was generated, as a fixed-point Unix timestamp string with seven-digit sub-second precision. Only applicable to Version 1 identifiers.
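
For orientation, the sketch below derives several of these values from the 16 raw bytes. Only the property names come from this manual; describeUuidSketch() is hypothetical, and the variant numbering in particular is an assumption chosen so that RFC 4122 UUIDs yield 1.

```php
<?php
// Sketch of how the documented properties relate to the raw bytes.
function describeUuidSketch($bytes)
{
    $hex    = bin2hex($bytes); // bin2hex() always yields lowercase digits
    $string = substr($hex, 0, 8) . '-' . substr($hex, 8, 4) . '-'
            . substr($hex, 12, 4) . '-' . substr($hex, 16, 4) . '-'
            . substr($hex, 20, 12);

    // Assumed numbering of the variant bits in octet 8:
    // 0xx => 0, 10x => 1 (RFC 4122), 110 => 2, 111 => 3
    $octet8 = ord($bytes[8]);
    if (($octet8 & 0x80) === 0x00) {
        $variant = 0;
    } elseif (($octet8 & 0xc0) === 0x80) {
        $variant = 1;
    } elseif (($octet8 & 0xe0) === 0xc0) {
        $variant = 2;
    } else {
        $variant = 3;
    }

    return array(
        'hex'     => $hex,
        'string'  => $string,
        'urn'     => 'urn:uuid:' . $string,
        'version' => ord($bytes[6]) >> 4, // high nibble of octet 6
        'variant' => $variant,
    );
}

print_r(describeUuidSketch(pack('H*', '550e8400e29b41d4a716446655440000')));
```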

Runtime configuration

UUID::initRandom()

PHP has a number of good sources for random numbers available to it, but most are either system-dependent, incur considerable overhead, or both. Consequently DrUUID uses the mt_rand() function to generate random numbers unless instructed to seek an alternative, usually cryptographically secure source. In order to use an alternative source the UUID::initRandom() static method must be invoked.

int UUID::initRandom( [int source] )

If invoked without arguments, the UUID::initRandom() method will attempt to make use of the best available randomness source; this may nevertheless be mt_rand().

An integer constant, source, may be passed to explicitly choose a source. Passing an unknown value will throw an exception; an exception will also be thrown if an explicitly selected source is not available.

A list of valid source constants is available in Appendix A.

Note that since the characteristics of any given system can be unpredictably different from those of another, users are encouraged to run their own benchmarks to ascertain whether the performance of both UUID::initRandom() and subsequent calls to UUID::randomBytes() warrants an alternative source's use.
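
The general shape of "best available source, else mt_rand()" can be sketched as follows. This is not DrUUID's selection logic: the function name, the order in which sources are tried and the labels returned are all assumptions made for illustration.

```php
<?php
// Sketch of randomness source selection with mt_rand() as the fallback.
// Returns array(bytes, label-of-source-used).
function randomBytesSketch($length)
{
    // 1. The /dev/urandom pseudo-device, where present
    if (@is_readable('/dev/urandom')) {
        $handle = @fopen('/dev/urandom', 'rb');
        if ($handle !== false) {
            $bytes = fread($handle, $length);
            fclose($handle);
            if (is_string($bytes) && strlen($bytes) === $length) {
                return array($bytes, 'urandom');
            }
        }
    }
    // 2. The OpenSSL extension, where loaded
    if (function_exists('openssl_random_pseudo_bytes')) {
        $bytes = openssl_random_pseudo_bytes($length);
        if (is_string($bytes) && strlen($bytes) === $length) {
            return array($bytes, 'openssl');
        }
    }
    // 3. Lowest common denominator
    $bytes = '';
    for ($i = 0; $i < $length; $i++) {
        $bytes .= chr(mt_rand(0, 255));
    }
    return array($bytes, 'mt_rand');
}

list($bytes, $source) = randomBytesSketch(16);
echo $source, ': ', bin2hex($bytes), "\n";
```

Timing a few thousand calls to a function like this against the mt_rand() branch alone is one simple way to run the benchmark suggested above.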

UUID::initBignum()

For Version 1 UUIDs, a 60-bit timestamp must be generated. On 32-bit systems, this causes PHP to use floating-point arithmetic, which yields inaccurate results, with precision only reliable to the millisecond rather than the microsecond. If either the GMP or BC Math extension is available, DrUUID can make use of it to produce accurate results:

int UUID::initBignum( [int means] )

If invoked without arguments, the UUID::initBignum() method will try to use the fastest means of producing accurate timestamps.

An integer constant, means, may be passed to explicitly choose a method. Passing an unknown value will throw an exception; an exception will also be thrown if an explicitly selected method is not available.

A list of valid means constants is available in Appendix A.

On 64-bit systems DrUUID will use accurate arithmetic without having to call UUID::initBignum().
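
To show why integer arithmetic matters, the sketch below computes the 60-bit timestamp RFC 4122 defines (a count of 100ns intervals since 1582-10-15) using native 64-bit integers, then shows a single tick vanishing in floating point. The function name is hypothetical and the code assumes a 64-bit build of PHP.

```php
<?php
// Sketch of the Version 1 timestamp calculation with native 64-bit integers.
function v1TicksSketch($microtime)
{
    if (PHP_INT_SIZE < 8) {
        throw new RuntimeException('Native 64-bit integers are not available');
    }
    // microtime() returns e.g. "0.12345600 1300000000": fraction, then seconds
    list($fraction, $seconds) = explode(' ', $microtime);
    $micro = (int) substr($fraction, 2, 6); // six digits after "0."
    // 122192928000000000 is the number of 100ns intervals between
    // 1582-10-15 and 1970-01-01
    return (int) $seconds * 10000000 + $micro * 10 + 122192928000000000;
}

$ticks = v1TicksSketch('0.12345600 1300000000');
echo $ticks, "\n"; // 135192928001234560

// At this magnitude adjacent floats are 16 ticks apart, so adding one tick
// to the floating-point value changes nothing.
var_dump((float) $ticks + 1.0 === (float) $ticks); // bool(true)
```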

UUID::initStorage()

In order to ensure uniqueness of Version 1 UUIDs, the value of the clock sequence, node ID and last timestamp used should be kept in stable storage for reference. By default DrUUID only keeps these values in memory, but they can also be written to a file:

void UUID::initStorage( string path )

DrUUID will read state from and write state to the file specified by path. If the file specified is not accessible, an exception will be thrown. The default implementation will create the file if it does not exist, but will not create missing folders.

If access to a file is either impossible or impractical, an API for implementing a custom storage is described in Section 5. If using custom storage, required arguments may be different.

UUID::initAccurate()

void UUID::initAccurate( string path )

The UUID::initAccurate() method is a shortcut to achieving optimal accuracy. It successively calls UUID::initBignum(), UUID::initRandom(), and UUID::initStorage(). Unlike calling the three methods by themselves, however, UUID::initAccurate() will reject results which will yield inaccurate UUIDs, and will throw an exception accordingly.
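
A typical calling pattern might look like the sketch below, which prefers the strict configuration and carries on with whatever configuration is in effect if it cannot be satisfied. The wrapper configureSketch(), its return labels and the state file name are hypothetical; loading DrUUID itself is left to the caller.

```php
<?php
// Usage sketch: try UUID::initAccurate(), and report what happened.
function configureSketch($path)
{
    if (!class_exists('UUID', false)) {
        return 'unavailable'; // DrUUID has not been loaded
    }
    try {
        UUID::initAccurate($path);
        return 'accurate';
    } catch (Exception $e) {
        // A UUIDException or UUIDStorageException; Appendix B lists the codes
        error_log('DrUUID: ' . $e->getMessage() . ' (' . $e->getCode() . ')');
        return 'fallback';
    }
}

echo configureSketch(sys_get_temp_dir() . '/druuid.state'), "\n";
```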

Using custom stable storage implementations

DrUUID includes a basic implementation of stable storage for Version 1 UUIDs which is consistent with Section 4.2.1 of RFC 4122. This implementation, however, is not especially efficient if UUIDs are expected to be created in bulk in a single session, nor can it write to a back-end other than a file. For more complex requirements, an API is available to allow DrUUID to communicate with alternative storage backends or otherwise tailor the implementation to individual needs.

UUID::registerStorage()

void UUID::registerStorage( string class_name [, mixed arg ... ] )

The class_name argument must be the name of a defined class which implements the UUIDStorage interface, described below. Any further arguments will be passed to UUID::initStorage().

If no supplementary arguments are passed, UUID::initStorage() must be called before the custom storage may be used.

The UUIDStorage interface

interface UUIDStorage {
 public function getNode();
 public function getSequence($timestamp, $node);
 public function setSequence($sequence);
 public function setTimestamp($timestamp, $sequence, $node);
 const maxSequence = 16383; // 00111111 11111111
}

The UUIDStorage interface defines a set of methods which DrUUID will call during the generation of Version 1 UUIDs in a predictable order to query storage and write data. A visualization of the process is available in Appendix C for reference. The order of the method calls is as follows:

  1. If a node ID is already available from the user, skip to Step 4
  2. Retrieve the last known node ID (MAC address) by calling getNode()
    • The storage may attempt to retrieve the actual MAC address; otherwise it returns the stored one, or NULL
  3. If the node ID is NULL because an existing value is not available, generate a new random node ID
  4. If the user has supplied a clock sequence for debugging, call setSequence() and skip to Step 7
  5. Retrieve the clock sequence by calling getSequence(), passing the timestamp and node ID
    • The storage invalidates any stored clock sequence if the node ID provided does not match that stored
    • The storage increments the stored clock sequence before returning it if the timestamp provided is older than that stored
  6. If the clock sequence is NULL because an existing value was not available or was invalidated, generate a new random clock sequence and call setSequence()
  7. Call setTimestamp() to update the stored timestamp, signalling the end of communication
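
The steps above can be sketched as a small driver function. Everything here is illustrative: resolveStateSketch(), randomNodeSketch() and the recording stub are hypothetical, and the stub simply returns NULL everywhere so that the full call sequence is exercised.

```php
<?php
// Hypothetical helper: a random 6-byte node ID. RFC 4122 (section 4.5) asks
// for the multicast bit to be set on node IDs which are not real MAC addresses.
function randomNodeSketch()
{
    $node = '';
    for ($i = 0; $i < 6; $i++) {
        $node .= chr(mt_rand(0, 255));
    }
    $node[0] = chr(ord($node[0]) | 0x01);
    return $node;
}

// Sketch of the call order; $storage is any object offering the four methods.
function resolveStateSketch($storage, $timestamp, $node = null, $sequence = null)
{
    if ($node === null) {                                     // steps 1 to 3
        $node = $storage->getNode();
        if ($node === null) {
            $node = randomNodeSketch();
        }
    }
    if ($sequence !== null) {                                 // step 4
        $storage->setSequence($sequence);
    } else {
        $sequence = $storage->getSequence($timestamp, $node); // step 5
        if ($sequence === null) {                             // step 6
            $sequence = pack('n', mt_rand(0, 16383));
            $storage->setSequence($sequence);
        }
    }
    $storage->setTimestamp($timestamp, $sequence, $node);     // step 7
    return array($node, $sequence);
}

// Stub storage which knows nothing and records the calls it receives.
class RecordingStorageSketch
{
    public $calls = array();
    public function getNode() { $this->calls[] = 'getNode'; return null; }
    public function getSequence($timestamp, $node) { $this->calls[] = 'getSequence'; return null; }
    public function setSequence($sequence) { $this->calls[] = 'setSequence'; }
    public function setTimestamp($timestamp, $sequence, $node) { $this->calls[] = 'setTimestamp'; }
}

$storage = new RecordingStorageSketch();
resolveStateSketch($storage, '13000000001234560');
echo implode(' -> ', $storage->calls), "\n";
// getNode -> getSequence -> setSequence -> setTimestamp
```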

The following subsections serve as implementation notes for the interface's methods.

getNode()

As DrUUID is unable to retrieve the system's MAC address, it calls the getNode() method, which might implement a means of doing so or retrieve one from storage. If it does return a value, it should be formatted as six bytes, in big-endian order (the reverse of conventional hexadecimal pair representation).

getSequence()

The getSequence() method is the heart of the interface, taking as input the target timestamp (as a number of 100ns ticks since the Unix epoch) and the node ID (as a six-byte string). Output should be a two-byte string, with the two most significant bits set to zero.

Per Section 4.1.5 of RFC 4122, the clock sequence should be randomized if the node ID changes, and should be incremented if the target timestamp is lower than that in storage. Due to the limits of 32-bit systems and the difficulties inherent in comparing floating-point numbers, the input timestamp is always a string with integer precision.

setSequence()

This method simply notifies the storage of a new clock sequence, whether because the user has supplied a sequence or because the storage failed to return a result. Input is a two-byte string; no return value is required.

setTimestamp()

This method serves as a marker that communication with the storage is complete and any buffered data may be written to stable storage if appropriate. Input is a string representation of the number of 100ns ticks since the Unix epoch.
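
Putting these notes together, a storage class which keeps its state in memory only might look like the sketch below. The class name MemoryStorageSketch is hypothetical, the interface is declared here only if DrUUID has not already defined it, and the wrap-around on increment is an assumption of the example.

```php
<?php
if (!interface_exists('UUIDStorage', false)) {
    interface UUIDStorage {
        public function getNode();
        public function getSequence($timestamp, $node);
        public function setSequence($sequence);
        public function setTimestamp($timestamp, $sequence, $node);
        const maxSequence = 16383;
    }
}

class MemoryStorageSketch implements UUIDStorage
{
    protected $node = null;      // six-byte string
    protected $sequence = null;  // integer, 0..maxSequence
    protected $timestamp = null; // decimal string

    public function getNode()
    {
        return $this->node;
    }

    public function getSequence($timestamp, $node)
    {
        if ($node !== $this->node) {
            // Node changed: invalidate, so the caller randomises the sequence
            $this->node = $node;
            $this->sequence = null;
            return null;
        }
        if ($this->sequence === null) {
            return null;
        }
        if (self::isOlder((string) $timestamp, $this->timestamp)) {
            // Clock went backwards: increment, wrapping within 14 bits
            $this->sequence = ($this->sequence + 1) & self::maxSequence;
        }
        return pack('n', $this->sequence);
    }

    public function setSequence($sequence)
    {
        $value = unpack('n', $sequence);
        $this->sequence = $value[1] & self::maxSequence;
    }

    public function setTimestamp($timestamp, $sequence, $node)
    {
        $this->timestamp = (string) $timestamp;
        $this->node = $node;
    }

    // Compares two non-negative decimal strings without integer overflow
    protected static function isOlder($a, $b)
    {
        if ($b === null) {
            return false;
        }
        $a = ltrim($a, '0');
        $b = ltrim($b, '0');
        if (strlen($a) !== strlen($b)) {
            return strlen($a) < strlen($b);
        }
        return strcmp($a, $b) < 0;
    }
}

$storage = new MemoryStorageSketch();
$node = "\x00\xc0\x4f\xd4\x30\xc8";
var_dump($storage->getSequence('200', $node));             // NULL: nothing stored yet
$storage->setSequence(pack('n', 5));
$storage->setTimestamp('200', pack('n', 5), $node);
echo bin2hex($storage->getSequence('100', $node)), "\n";   // 0006: the clock went backwards
```

With DrUUID loaded, a class like this would be handed to the library by name via UUID::registerStorage().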

Predefined constants

For convenience DrUUID includes a number of class constants, with three distinct groups:

These groups are documented in this appendix.

Randomness source constants

Constant           Description                                                                        Value
UUID::randChoose   For auto-detection. The best available randomness source will be used.             -1
UUID::randPoor     PHP's mt_rand() function, the lowest common denominator.                           0
UUID::randDev      The /dev/urandom pseudo-device file, available on most Unix-like systems.          1
UUID::randCAPICOM  COM calls to CAPICOM's GetRandom method, available on Windows prior to Windows 7.  2
UUID::randOpenSSL  The openssl_random_pseudo_bytes() function, available since PHP 5.3.0.             3
UUID::randMcrypt   The mcrypt_create_iv() function.                                                   4

Bignum method constants

Constant         Description                                                                  Value
UUID::bigChoose  For auto-detection. Native 64-bit will trump GMP, which will trump BC Math.  -1
UUID::bigNot     No 64-bit integer/bignum support. This will produce inaccurate results.      0
UUID::bigNative  Native 64-bit integer support. This is fastest.                              1
UUID::bigGMP     GNU Multiple Precision library.                                              2
UUID::bigBC      BC Math library.                                                             3

Namespace constants

Constant      Namespace description                                      UUID
UUID::nsDNS   DNS hostnames (eg. "www.example.com")                      6ba7b810-9dad-11d1-80b4-00c04fd430c8
UUID::nsURL   Any valid URL (eg. "http://www.example.com/example.html")  6ba7b811-9dad-11d1-80b4-00c04fd430c8
UUID::nsOID   An ISO Object Identifier                                   6ba7b812-9dad-11d1-80b4-00c04fd430c8
UUID::nsX500  An X.500 Distinguished Name                                6ba7b814-9dad-11d1-80b4-00c04fd430c8

Exception codes

DrUUID will throw either UUIDException or UUIDStorageException exceptions under various circumstances. Details on these exceptions are below.

Exception details

Code  Type                  Description                                                       Public methods [1]
0001  UUIDException         Selected version is invalid or unsupported.                       UUID::mint()
0002  UUIDException         Version 2 is unsupported.                                         UUID::mint()
0003  UUIDException         Input must be a valid UUID.                                       UUID::import()
0101  UUIDException         Node must be a valid MAC address.                                 UUID::mint(1)
0102  UUIDException         Clock sequence must be a two-byte binary string.                  UUID::mint(1)
0103  UUIDException         Time input was of an unexpected format.                           UUID::mint(1)
0201  UUIDException         A name-string is required for Version 3 or 5 UUIDs.               UUID::mint(3), UUID::mint(5)
0202  UUIDException         A valid UUID namespace is required for Version 3 or 5 UUIDs.      UUID::mint(3), UUID::mint(5)
0801  UUIDException         Bignum method is not available.                                   UUID::initBignum()
0802  UUIDException         Randomness strategy is not available.                             UUID::initRandom()
0901  UUIDException         Bignum method not implemented.                                    UUID::initBignum()
0902  UUIDException         Randomness strategy not implemented.                              UUID::initRandom()
1001  UUIDStorageException  Storage class does not exist.                                     UUID::registerStorage()
1002  UUIDStorageException  Storage class does not implement the UUIDStorage interface.       UUID::registerStorage()
1003  UUIDStorageException  Storage class could not be instantiated with supplied arguments.  UUID::initStorage(), UUID::registerStorage(), UUID::initAccurate() [2]
1101  UUIDStorageException  Stable storage is not readable.                                   UUID::initStorage(), UUID::initAccurate() [2]
1102  UUIDStorageException  Stable storage is not writable.                                   UUID::initStorage(), UUID::initAccurate() [2]
1201  UUIDStorageException  Stable storage could not be read.                                 UUID::mint(1)
1202  UUIDStorageException  Stable storage could not be written.                              UUID::mint(1)
1203  UUIDStorageException  Stable storage data is invalid or corrupted.                      UUID::mint(1)
2001  UUIDException         64-bit integer arithmetic is not available.                       UUID::initAccurate()
2002  UUIDException         Secure random number generator is not available.                  UUID::initAccurate()
2003  UUIDStorageException  Stable storage not available.                                     UUID::initAccurate()
2004  UUIDStorageException  Storage is invalid.                                               UUID::initAccurate()

Notes:
  1. References to UUID::mint() should be understood to include UUID::mintStr().
  2. As a chained exception.

Flow chart for storage interface

[Flow chart not reproduced. Labels: Begin, Have Node, getNode(), Null, Generate Node ID, User Seq, setSequence(), getSequence(), Null, Generate Sequence, setTimestamp(), Done. The same process is listed step by step in Section 5.]

Credits and licensing

DrUUID and its manual (i.e. this document) were written by J. King. They are both governed by the following license:

Copyright (c) 2009 J. King

Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation
files (the "Software"), to deal in the Software without
restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following
conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.

This manual's stylesheet was written by Dustin Wilson. It is licensed under the Creative Commons Attribution license (v2.5).

This software is dedicated to Seung Park. HLN forever!

Revision history

2014-09-06
Major enhancements:
2011-03-20
Refined the generation of Version 1 UUIDs. This sees the addition of the sequence and time parameters to UUID::mint(1), as well as the addition of UUID::seq().
2010-02-15
Fixed bug in UUID::import as reported by Sander van Lambalgen.
2009-11-26
Fixed previously non-functional UUID::compare() method. Also allowed input UUIDs to be RFC 4122 URNs.
2009-11-11
Various changes:
2009-09-28
Fixed a minor bug preventing /dev/urandom from being used. Reported by Rubén Marrero.
2009-04-13
Fixed two serious bugs in Version 5 generation and string casting.
2009-04-11
First release.