Why Your ID Strategy Matters
Every record in your database needs an identifier. Sounds simple, but this decision cascades through your entire system:
- Sequential IDs (1, 2, 3...) leak information about your growth and enable enumeration attacks
- UUIDs are globally unique but verbose and slower to index
- Hashes provide fingerprinting but are not reversible
- Custom formats (invoice numbers, order codes) need to balance human-readability with uniqueness
The right choice depends on your security requirements, database performance, API design, and whether you need offline ID generation. Let's break down each approach.
Sequential IDs: Simple but Leaky
What they are: Auto-incrementing integers (1, 2, 3, 4...)
Pros:
- Simple to implement (`AUTO_INCREMENT` in MySQL, `SERIAL` in PostgreSQL)
- Compact (4-8 bytes vs. 16 bytes for a binary UUID)
- Fast indexing and sorting
- Human-readable
Cons:
- Information leakage: User #47 can infer that about 46 accounts were created before theirs
- Enumeration attacks: Scraping `/api/invoices/1`, `/api/invoices/2`... is trivial
- Merge conflicts: Hard to combine databases without ID collisions
- Predictability: Easy to guess valid IDs
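To make the information-leakage point concrete, here is a minimal sketch of how an outsider could estimate growth from two sequential IDs seen on different days. The function name and the sample numbers are hypothetical, chosen only for illustration.

```python
from datetime import date

def estimated_daily_signups(id_a: int, day_a: date, id_b: int, day_b: date) -> float:
    """Estimate new records per day from two sequential IDs and their dates."""
    days = (day_b - day_a).days
    if days <= 0:
        raise ValueError("day_b must be later than day_a")
    return (id_b - id_a) / days

# Hypothetical observation: account 1000 created on 1 March, account 1700 a week later
rate = estimated_daily_signups(1000, date(2026, 3, 1), 1700, date(2026, 3, 8))
print(rate)
```

Anyone who can create two accounts a week apart can run the same arithmetic against your system.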
When to use: Internal systems, timestamps, logs, or anything where the sequence itself is valuable (invoice numbers where chronological order matters).
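South African example: SARS eFiling-style reference numbers keep a sequential core but wrap it in a prefix and a check character (an illustrative format: `EMP-2026-00047-A`). This adds context and typo detection, though the sequence itself is still visible.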
UUIDs: Globally Unique, Locally Verbose
What they are: 128-bit identifiers like `550e8400-e29b-41d4-a716-446655440000`
Pros:
- Globally unique (generate offline without database coordination)
- No enumeration attacks with random (v4) UUIDs, because the next ID is unpredictable
- Safe for distributed systems and merging databases
- Standard format (RFC 9562, which obsoletes RFC 4122)
Cons:
- Size: 36 characters as a string, 16 bytes as binary (vs. 4-8 for integers)
- Indexing overhead: Random (v4) values land all over a B-tree index instead of appending at the end, so inserts are slower and indexes larger than with sequential IDs
- Not human-friendly: Hard to read, type, or communicate verbally
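The size figures are easy to check with Python's standard `uuid` module. This is a small sketch; the variable names are arbitrary.

```python
import uuid

u = uuid.uuid4()            # random (version 4) UUID

text_form = str(u)          # canonical hyphenated hex string
binary_form = u.bytes       # raw 128-bit value

print(len(text_form))       # 36 characters
print(len(binary_form))     # 16 bytes

# The binary form round-trips to the same UUID, so nothing is lost by storing 16 bytes
restored = uuid.UUID(bytes=binary_form)
```

Storing the binary form (or a native `uuid` column type where the database has one) avoids paying for the 36-character string in every index entry.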
UUID versions:
- v1: Timestamp + MAC address (reveals creation time and device)
- v4: Random (most common, no information leakage)
- v7: Timestamp-ordered (new, combines UUID benefits with sequential indexing performance)
When to use: Public APIs, user-facing IDs, microservices, offline-first apps, any system where enumeration is a risk.
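Tip: Prefer UUID v7 when you can generate it. PostgreSQL 18 adds a built-in `uuidv7()` function; on older PostgreSQL versions and on MySQL 8, generate v7 values in the application (or via an extension) and store them in a `uuid` or `BINARY(16)` column. You get uniqueness *and* index-friendly ordering.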
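If your runtime has no v7 generator, the layout is simple enough to sketch by hand: a 48-bit Unix timestamp in milliseconds, followed by random bits, with the version and variant fields set. The function below, `uuid7_sketch`, is a hypothetical helper written for this article, not a library API; for production use, prefer a maintained implementation.

```python
import os
import time
import uuid

def uuid7_sketch(unix_ms=None):
    """Build a UUIDv7-style value: 48-bit millisecond timestamp, then random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")          # 80 random bits
    value = ((unix_ms & ((1 << 48) - 1)) << 80) | rand    # timestamp in the top 48 bits
    value = (value & ~(0xF << 76)) | (0x7 << 76)          # version nibble = 7
    value = (value & ~(0x3 << 62)) | (0x2 << 62)          # variant bits = binary 10
    return uuid.UUID(int=value)

earlier = uuid7_sketch(unix_ms=1000)
later = uuid7_sketch(unix_ms=2000)
print(earlier < later)    # IDs created in later milliseconds sort later
```

Note that this sketch only orders IDs across different milliseconds; two IDs created within the same millisecond sort by their random bits.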
Hashes: Fingerprints for Content
What they are: Cryptographic hashes (MD5, SHA-256) that deterministically map input to a fixed-size output
Pros:
- Content-addressable: Same input always produces the same hash
- Tamper detection: Changing one bit changes the entire hash
- Deduplication: Identical files produce identical hashes
- Fixed size: SHA-256 is always 64 hex characters
Cons:
- Not reversible: You cannot extract the original data from a hash
- Collision risk: Astronomically unlikely with SHA-256, but practical to produce deliberately with MD5, so avoid MD5 wherever an attacker controls the input
- Overkill for simple IDs: If you just need uniqueness, UUIDs are simpler
When to use:
- File uploads (detect duplicates, verify integrity)
- API request signatures (HMAC authentication)
- Password storage, but only with a purpose-built password hash (bcrypt, Argon2), never plain SHA
- Cache keys (hash the query parameters)
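As an illustration of the request-signature use case, here is a minimal HMAC-SHA256 sketch using only the standard library. The `sign` and `verify` helpers and the sample secret are assumptions made for this example; real APIs usually also sign a timestamp or nonce to prevent replay.

```python
import hashlib
import hmac

def sign(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 signature of a request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign(secret, body), signature)

secret = b"demo-secret"                      # hypothetical shared secret
body = b'{"invoice": "INV-2026-0047"}'
signature = sign(secret, body)

print(verify(secret, body, signature))               # untouched body
print(verify(secret, body + b" ", signature))        # tampered body
```

The constant-time comparison matters: a plain `==` can leak how many leading characters matched through timing differences.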
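Example: Your image uploader receives `photo.jpg`. Hash the file contents with SHA-256 → `a3f8b9c1d2e4f5...` → store the file under that digest (shortened to `a3f8b9c1.jpg` here for readability; use the full digest in practice, because an 8-character prefix collides far sooner). Same photo uploaded twice? Auto-deduplicated.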
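The uploader idea can be sketched in a few lines. The in-memory `store` dictionary stands in for a real object store or filesystem, and `content_key` and `save` are hypothetical names used only in this example.

```python
import hashlib

store = {}  # stand-in for a bucket or directory: key -> file bytes

def content_key(data: bytes, extension: str) -> str:
    """Derive a storage key from the file contents, not the file name."""
    digest = hashlib.sha256(data).hexdigest()   # full 64-character digest
    return f"{digest}.{extension}"

def save(data: bytes, extension: str) -> str:
    """Store the bytes once; a repeat upload maps to the existing key."""
    key = content_key(data, extension)
    store.setdefault(key, data)
    return key

first = save(b"abc", "jpg")
second = save(b"abc", "jpg")      # same bytes uploaded again
print(first == second, len(store))
```

Because the key depends only on the bytes, renaming a file before uploading it again still hits the same entry.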
Custom Formats: Human-Readable IDs
What they are: Domain-specific formats like `INV-2026-0047`, `ORD-ZA-A3F8`, `TRX-20260304-1200`
Pros:
- Context at a glance: `INV-` tells you it's an invoice
- Human-friendly: Easier to communicate ("Invoice 47" vs. "550e8400-e29b")
- Sortable: Include timestamps or sequential components
- Branding: Matches your business domain
Cons:
- Collision risk: You must ensure uniqueness yourself. A timestamp plus random suffix makes collisions unlikely, but only a unique constraint in the database (with a retry on conflict) rules them out
- Parsing complexity: Regex validation, format enforcement
- Migration pain: Changing the format later is hard
Recipe for safe custom IDs:
1. Prefix (type identifier): `INV-`, `ORD-`, `TRX-`
2. Date component (sortable): `20260304` or `2026-03`
3. Unique suffix: Random 4-6 characters or auto-increment scoped to the date
4. Optional check digit to catch typos (the Luhn algorithm for all-numeric IDs; alphanumeric IDs need a variant such as Luhn mod N)
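The recipe can be sketched as follows, assuming a numeric layout of `PREFIX-YYYYMM-NNNN-C` where `C` is a Luhn check digit computed over the date and sequence digits. That layout, and the `make_id` and `is_valid` helpers, are assumptions for illustration; adapt them to your own format.

```python
import re
from datetime import date

ID_PATTERN = re.compile(r"^([A-Z]{3})-(\d{6})-(\d{4})-(\d)$")

def luhn_check_digit(digits: str) -> int:
    """Compute the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk right to left, doubling every second digit starting with the rightmost
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10

def make_id(prefix: str, day: date, seq: int) -> str:
    """Build an ID from a prefix, a year-month component, and a 4-digit sequence."""
    date_part = f"{day:%Y%m}"
    seq_part = f"{seq:04d}"
    check = luhn_check_digit(date_part + seq_part)
    return f"{prefix}-{date_part}-{seq_part}-{check}"

def is_valid(candidate: str) -> bool:
    """Validate the format and the check digit."""
    match = ID_PATTERN.match(candidate)
    if not match:
        return False
    _, date_part, seq_part, check = match.groups()
    return luhn_check_digit(date_part + seq_part) == int(check)

invoice_id = make_id("INV", date(2026, 3, 4), 47)
print(invoice_id)                          # INV-202603-0047-8
print(is_valid("INV-202603-0074-8"))       # transposed digits are rejected
```

The check digit catches single-digit typos and most adjacent transpositions, but it does not guarantee uniqueness; the sequence still has to come from a source that never repeats within the month.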
Example: `INV-2026-03-A4F9` = Invoice from March 2026, unique suffix A4F9. Use the Invoice Number Generator to prototype custom formats.
Decision Matrix: Which ID to Use
Use sequential IDs when:
- Internal-only system (no public API exposure)
- Chronological order is important (audit logs, event streams)
- Database performance is critical and scale is moderate
Use UUIDs when:
- Public API or user-facing IDs
- Distributed system or offline ID generation
- Security: prevent enumeration attacks
- Merging databases from multiple sources
Use hashes when:
- Content-addressable storage (files, images)
- Deduplication is valuable
- Integrity verification matters (downloads, backups)
- API authentication (HMAC signatures)
Use custom formats when:
- Human readability is critical (invoices, support tickets)
- Domain-specific context helps (prefixes like `INV-`, `ORD-`)
- You need both uniqueness and branding
Pro tip: You can mix strategies. Use UUIDs for user IDs, sequential for internal logs, hashes for file uploads, and custom formats for invoices. Different problems, different solutions.
Try the UUID Generator and Hash Generator to experiment with different formats and see what fits your use case.