How UUIDs work internally

May 12, 2026 · 15 min read

Strings like 550e8400-e29b-41d4-a716-446655440000 are just one skin on 16 bytes. When integrations fail, the bug is almost always byte order or version bits, not "broken UUID math."

The 128-bit layout

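RFC 4122 (since superseded by RFC 9562) packs 128 bits into fields that the canonical string shows as five hyphen-separated groups: time_low (32 bits), time_mid (16 bits), time_hi_and_version (16 bits), the clock sequence including the variant bits (16 bits, which the RFC splits into clock_seq_hi_and_reserved and clock_seq_low), and node (48 bits).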

Random v4 reuses this skeleton but fills every bit except the version and variant with entropy. v7 overwrites the leading 48 bits with a Unix timestamp. The hyphenated string is a big-endian hex presentation of those fields in a defined order.

Example: 550e8400-e29b-41d4-a716-446655440000
        |time_low|mid |ver |seq |    node    |
Version nibble --------^ (4 in "41d4" -> 0x4xxx)
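As a quick check of the layout above, Python's standard uuid module exposes the same fields as integer attributes. This is a minimal sketch, not part of the original diagram; note that Python follows the RFC in splitting the clock sequence into two one-byte fields, so it reports six values.

```python
import uuid

u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

# Attribute names mirror the RFC field names; each value is an integer.
fields = {
    "time_low": u.time_low,
    "time_mid": u.time_mid,
    "time_hi_version": u.time_hi_version,
    "clock_seq_hi_variant": u.clock_seq_hi_variant,
    "clock_seq_low": u.clock_seq_low,
    "node": u.node,
}

for name, value in fields.items():
    print(f"{name:22} 0x{value:x}")

# The version is the high nibble of time_hi_version.
print("version:", u.version)
```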

Version and variant nibbles

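The version sits in the high nibble of time_hi_and_version. Defined values are 1 through 8 (2 is the rarely seen DCE Security version; 6, 7, and 8 were added by RFC 9562). The variant is encoded in the most significant bits of the clock sequence field; the RFC 4122 variant starts with binary 10, so the first hex digit of the fourth group is 8, 9, a, or b. Strict validators reject strings where these bits are wrong even if the hex length is perfect.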

Example: a string with version nibble 0 or a variant that does not match RFC 4122 is not a valid UUID even if every character is hexadecimal (the special all-zero nil UUID and the all-ones max UUID are the defined exceptions). That is why copy-paste from corrupted logs often fails validation - one nibble flipped during OCR or PDF export breaks the structural rules.
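The two structural checks can be sketched with plain bit arithmetic. The function names `structure_of` and `is_strict_uuid` are hypothetical, and the sketch assumes the strict 36-character hyphenated form; it deliberately rejects the nil and max UUIDs, which a production validator may want to allow.

```python
import re

# Shape only: 8-4-4-4-12 hex digits, hyphenated.
_SHAPE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)

def structure_of(text):
    """Return (version, variant_is_rfc4122), or None if the shape is wrong."""
    if not _SHAPE.fullmatch(text):
        return None
    n = int(text.replace("-", ""), 16)
    version = (n >> 76) & 0xF                  # high nibble of time_hi_and_version
    variant_ok = ((n >> 62) & 0b11) == 0b10    # top two bits of the clock sequence
    return version, variant_ok

def is_strict_uuid(text, allowed_versions=frozenset(range(1, 9))):
    """True only if shape, version nibble, and variant bits all check out."""
    result = structure_of(text)
    return result is not None and result[0] in allowed_versions and result[1]

print(is_strict_uuid("550e8400-e29b-41d4-a716-446655440000"))  # well-formed v4
print(is_strict_uuid("550e8400-e29b-01d4-a716-446655440000"))  # version nibble 0
print(is_strict_uuid("550e8400-e29b-41d4-c716-446655440000"))  # variant bits 11
```

Only the first call passes: the second fails on the version nibble and the third on the variant, although all three strings are 36 characters of valid hex and hyphens.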

How v4 and v7 populate the same skeleton

Version 4 sets the version to 4, sets the variant bits, and fills the remaining 122 bits with cryptographically strong randomness (implementation quality matters - use the OS CSPRNG, not Math.random()).

Version 7 writes a 48-bit Unix timestamp in milliseconds into the most significant bits, sets the version and variant, and fills the remaining 74 bits with random data (implementations may reserve some of them for a counter). The result still parses as a UUID string, but the leading hex changes slowly over time, which is why v7 keys behave more like auto-increment integers than like random v4 keys in B-tree indexes.

v4: random everywhere (except version/variant)
v7: | timestamp (48b) | ver (4b) | random (12b) | var (2b) | random (62b) |
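To make the shared skeleton concrete, here is a minimal sketch that builds both versions by hand from OS randomness. The helper names (`_stamp`, `make_v4`, `make_v7`) are hypothetical, and the v7 variant omits the optional monotonic counter, so two values created in the same millisecond are not guaranteed to sort in creation order.

```python
import os
import time
import uuid
from typing import Optional

def _stamp(raw: bytearray, version: int) -> uuid.UUID:
    """Overwrite the version nibble and the two variant bits in place."""
    raw[6] = (raw[6] & 0x0F) | (version << 4)  # high nibble of byte 6
    raw[8] = (raw[8] & 0x3F) | 0x80            # top two bits of byte 8 become 10
    return uuid.UUID(bytes=bytes(raw))

def make_v4() -> uuid.UUID:
    # os.urandom reads from the operating system CSPRNG.
    return _stamp(bytearray(os.urandom(16)), 4)

def make_v7(unix_ms: Optional[int] = None) -> uuid.UUID:
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    # 6 bytes of big-endian timestamp, then 10 random bytes.
    raw = bytearray(unix_ms.to_bytes(6, "big") + os.urandom(10))
    return _stamp(raw, 7)

def v7_unix_ms(u: uuid.UUID) -> int:
    """Recover the millisecond timestamp from the top 48 bits."""
    return u.int >> 80

print(make_v4())
print(make_v7())
```

Because the timestamp occupies the most significant bytes, comparing two such v7 values from different milliseconds gives the same order as comparing their timestamps.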

Endianness: where Microsoft GUIDs diverge

The first three fields (time_low, time_mid, time_hi_and_version) are stored little-endian in some ecosystems (.NET Guid, SQL Server uniqueidentifier on the wire). The last two fields are plain byte sequences and keep the same order everywhere. The string form is identical in both worlds; only the raw bytes differ, so copying hex dumps between Java and C# without conversion produces "different" UUIDs for the same logical value.
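The conversion is a fixed byte shuffle, sketched below. The function name `swap_guid_order` is hypothetical; Python's own `UUID.bytes_le` attribute produces the same mixed-endian layout and is used here as a cross-check.

```python
import uuid

def swap_guid_order(raw: bytes) -> bytes:
    """Convert between RFC byte order and the mixed-endian GUID layout.

    Reverses the first 4, next 2, and next 2 bytes and leaves the last 8
    untouched. Applying it twice returns the original bytes.
    """
    if len(raw) != 16:
        raise ValueError("expected exactly 16 bytes")
    return raw[3::-1] + raw[5:3:-1] + raw[7:5:-1] + raw[8:]

u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")
rfc_bytes = u.bytes
guid_bytes = swap_guid_order(rfc_bytes)

print(rfc_bytes.hex())   # 550e8400e29b41d4a716446655440000
print(guid_bytes.hex())  # 00840e559be2d441a716446655440000

# Reading GUID-ordered bytes as if they were RFC-ordered yields a
# different, wrong UUID rather than an error.
misread = uuid.UUID(bytes=guid_bytes)
print(misread)           # 00840e55-9be2-d441-a716-446655440000
```

Note that the misread value no longer has version nibble 4, which is one reason checking the version is a cheap early test for a byte-order mistake.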

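MongoDB Extended JSON marks UUIDs with binary subtype 04 (standard, RFC byte order) or subtype 03 (legacy, whose byte order depends on the driver that wrote it, for example the old C# driver). Our MongoDB UUID converter exists because this confusion is routine in data migrations.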
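A rough sketch of the two encodings, using only the standard library rather than a MongoDB driver. The function names are hypothetical, and the decoder assumes that any subtype 03 value was written by the legacy C# driver; values from other legacy drivers use different byte orders and would decode incorrectly here.

```python
import base64
import json
import uuid

def to_extended_json(u: uuid.UUID, legacy_csharp: bool = False) -> dict:
    """Encode a UUID in the Extended JSON $binary shape."""
    payload = u.bytes_le if legacy_csharp else u.bytes
    return {
        "$binary": {
            "base64": base64.b64encode(payload).decode("ascii"),
            "subType": "03" if legacy_csharp else "04",
        }
    }

def from_extended_json(doc: dict) -> uuid.UUID:
    """Decode a $binary UUID. ASSUMPTION: subtype 03 means legacy C# order."""
    binary = doc["$binary"]
    raw = base64.b64decode(binary["base64"])
    subtype = int(binary["subType"], 16)
    if subtype == 4:
        return uuid.UUID(bytes=raw)
    if subtype == 3:
        return uuid.UUID(bytes_le=raw)
    raise ValueError(f"not a UUID subtype: {binary['subType']}")

u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")
standard = to_extended_json(u)
legacy = to_extended_json(u, legacy_csharp=True)

print(json.dumps(standard))
print(json.dumps(legacy))
```

The two documents carry different base64 payloads for the same logical UUID, so the subtype has to travel with the bytes for the value to be decoded correctly.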

How databases store UUIDs

Storing random v4 in BINARY(16) saves space over CHAR(36) but still fragments B-tree indexes, because each insert lands at a random position. v7 or ULID improves sequential insert performance because the leading bytes change slowly.

-- PostgreSQL: store as native uuid, compare efficiently
SELECT id FROM users WHERE id = '550e8400-e29b-41d4-a716-446655440000'::uuid;

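-- MySQL 8+: UUID_TO_BIN / BIN_TO_UUID for byte order control.
-- Swap flag 1 swaps the time-low and time-high groups (useful for v1 only);
-- read such values back with BIN_TO_UUID(col, 1).
SELECT UUID_TO_BIN('550e8400-e29b-41d4-a716-446655440000', 1);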

When exporting to data warehouses, teams often convert UUIDs to strings for Parquet readability. Re-importing those files is where endianness bugs return, because the importer has to choose a byte order when it converts back to binary - always store the canonical string alongside raw bytes if you need lossless round-trip.
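The round-trip advice can be sketched with an in-memory SQLite database, chosen here only because it ships with Python; the table and column names are hypothetical. Each row keeps the raw bytes and the canonical string, and a re-import check confirms the two still describe the same value.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id_bin BLOB PRIMARY KEY, id_text TEXT NOT NULL)"
)

u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")
# Store RFC-ordered bytes next to the canonical lowercase string.
conn.execute("INSERT INTO users VALUES (?, ?)", (u.bytes, str(u)))

def row_is_consistent(id_bin: bytes, id_text: str) -> bool:
    """True if the raw bytes, read in RFC order, match the stored string."""
    return len(id_bin) == 16 and uuid.UUID(bytes=id_bin) == uuid.UUID(id_text)

rows = conn.execute("SELECT id_bin, id_text FROM users").fetchall()
checks = [row_is_consistent(id_bin, id_text) for id_bin, id_text in rows]
print(checks)
```

A row whose bytes were written in GUID order fails this check, which turns a silent byte-order mismatch into something a re-import job can report.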

A practical debugging workflow

  1. Normalize to the lowercase canonical string.
  2. Confirm the version nibble matches the generator you think you used.
  3. Dump the 16-byte hex from the DB and compare it to converter output.
  4. If the mismatch is only in the first 8 bytes, suspect endianness.
  5. Log both string and binary during one request to catch driver bugs early.
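Steps 1 through 4 above can be sketched as one helper. The name `diagnose` and its return shape are hypothetical, and note that `uuid.UUID` is a lenient parser (it accepts upper case, braces, and a `urn:uuid:` prefix), so it normalizes input but is not a strict validator.

```python
import uuid

def diagnose(text: str, db_hex: str, expected_version: int) -> dict:
    """Compare a UUID string against a 16-byte hex dump from the database."""
    u = uuid.UUID(text.strip())               # step 1: parse and normalize
    raw = bytes.fromhex(db_hex)               # step 3: bytes as stored
    if len(raw) != 16:
        raise ValueError("hex dump must decode to exactly 16 bytes")

    if raw == u.bytes:
        verdict = "match"
    elif raw[8:] == u.bytes[8:] and raw[:8] == u.bytes_le[:8]:
        verdict = "endianness"                # step 4: only first 8 bytes differ
    else:
        verdict = "mismatch"

    return {
        "canonical": str(u),
        "version_ok": u.version == expected_version,   # step 2
        "bytes": verdict,
    }

report = diagnose(
    "{550E8400-E29B-41D4-A716-446655440000}",
    "00840e559be2d441a716446655440000",
    expected_version=4,
)
print(report)
```

In this example the string is a valid v4 value but the stored bytes are in GUID order, so the report flags endianness rather than corruption. Step 5 is left to the application's own logging.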

FAQ

Why does my UUID start with ff?
Check if you are viewing binary as hex without proper field boundaries, or if data is not a UUID at all.
Are UUIDs always 36 characters?
In canonical hyphenated form, yes. Compact 32-hex and URN (urn:uuid:...) forms are common alternatives.
Can two different strings be the same UUID?
Yes, if casing differs, or if a lenient parser accepts braces, missing hyphens, or omitted leading zeros. Always use a strict validator.
What is the UUID epoch?
v1 timestamps count 100-ns intervals since 1582-10-15 UTC. v7 uses Unix ms. Do not mix semantics when parsing.
How many bytes is a UUID on the wire?
In binary form, always 16 bytes. The string form is a presentation layer, so text protocols such as JSON carry the 36-character string instead.
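The epoch answer above can be checked with a short sketch. The function names are hypothetical; the v1 conversion drops the last decimal digit of the 100-ns count because Python's timedelta has microsecond resolution.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Start of the Gregorian calendar, the zero point for v1 timestamps.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)

def v1_datetime(u: uuid.UUID) -> datetime:
    """Interpret a v1 timestamp: 100-ns intervals since 1582-10-15 UTC."""
    if u.version != 1:
        raise ValueError("not a version 1 UUID")
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)

def v7_datetime(u: uuid.UUID) -> datetime:
    """Interpret a v7 timestamp: Unix milliseconds in the top 48 bits."""
    if u.version != 7:
        raise ValueError("not a version 7 UUID")
    return datetime.fromtimestamp((u.int >> 80) / 1000, tz=timezone.utc)

# A v1 value whose timestamp field encodes the Unix epoch.
print(v1_datetime(uuid.UUID("13814000-1dd2-11b2-8000-000000000000")))
```

Each function refuses the other version, which guards against the mixed-semantics mistake the answer warns about.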
