
From Password to Key: How PBKDF2 Bridges the Gap

Encryption algorithms don't work with passwords. AES-256-GCM, the algorithm behind most modern encryption (including Media Den), requires a 256-bit key: a sequence of ones and zeros that should be indistinguishable from pure randomness. Your password is "Melanie2019!" or something worse. These two things are not the same, and the gap between them is where a lot of security falls apart.

The function that bridges that gap is called a key derivation function, or KDF. The most widely deployed one is PBKDF2, and it has been in use for over twenty-five years. Here's how it got here.

Storing Passwords Without Storing Passwords

The first serious attempt to handle passwords securely came from Bell Labs in the late 1970s. Robert Morris and Ken Thompson, both working on Unix, published a short paper in 1979 called "Password Security: A Case History."[1] The paper described a problem they had observed firsthand: Unix stored user passwords in a file, and anyone who could read that file could log in as anyone else.

Their solution was to stop storing passwords entirely. Instead, when a user set a password, Unix would run it through a modified version of DES (the Data Encryption Standard) and store only the output, a hash. When the user later tried to log in, the system would hash whatever they typed and compare it to the stored hash. If the two matched, the password was correct. The original password was never written to disk.

This worked, but Morris and Thompson identified two problems immediately. First, if two users had the same password, they would produce the same hash. An attacker who cracked one had cracked both. Second, an attacker could precompute a table of common passwords and their hashes, then look up any stolen hash in the table instantly.

Example
Without a salt, two users who both choose "hunter2" produce the same stored hash:
alice → hash("hunter2") → 2ab96390c7
bob   → hash("hunter2") → 2ab96390c7

To address this, they introduced the concept of a salt: a small random value (12 bits, in their implementation) appended to the password before hashing. The salt was stored alongside the hash in the clear; it didn't need to be secret. Its purpose was to ensure that the same password, chosen by two different users, would produce two different hashes. It also meant that any precomputed table would need to cover every possible salt value, multiplying the attacker's storage requirements by 4,096.

Example
With a random salt stored alongside each entry, the same password produces completely different hashes:
alice → hash("hunter2" + f7a3) → 8c1f40e72d
bob   → hash("hunter2" + 2e0b) → 5d3b9af104
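
The effect of a salt is easy to demonstrate. The sketch below uses Python's standard library with SHA-256 as a stand-in for the 1979 DES-based scheme (the hash function and the 10-character truncation are illustrative, not the historical implementation):

```python
import hashlib

def unsalted(password: str) -> str:
    # Same input, same digest, every time.
    return hashlib.sha256(password.encode()).hexdigest()[:10]

def salted(password: str, salt: bytes) -> str:
    # A per-user salt appended to the password makes identical
    # passwords diverge; the salt itself is stored in the clear.
    return hashlib.sha256(password.encode() + salt).hexdigest()[:10]

# Two users with the same password:
assert unsalted("hunter2") == unsalted("hunter2")
assert salted("hunter2", bytes.fromhex("f7a3")) != salted("hunter2", bytes.fromhex("2e0b"))
```

A precomputed table of unsalted hashes cracks every user at once; with salts, the attacker must redo the work per salt value.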

They also made the hashing process deliberately slow by running DES 25 times in sequence, rather than once. This had negligible effect on a user logging in (a fraction of a second), but multiplied the cost of trying millions of passwords by 25.
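
The deliberate slowdown can be sketched the same way: chain the hash a fixed number of rounds, again with SHA-256 standing in for DES. The 25-round default mirrors the Unix scheme; nothing else here is historical.

```python
import hashlib

def iterated_hash(password: str, salt: bytes, rounds: int = 25) -> bytes:
    # Each round's output feeds the next round's input, so the
    # work is inherently sequential and cannot be skipped.
    digest = password.encode() + salt
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest
```

A legitimate login pays this cost once; an attacker guessing millions of passwords pays it once per guess.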

These two ideas, salt and slow hashing, became the foundation of password-based key derivation. But in 1979, they were implemented as a one-off Unix mechanism, not as a general-purpose tool.

The Precomputation Problem Gets Worse

For two decades, the salt-and-hash approach held up reasonably well. Then hardware got faster, and a more efficient class of attack arrived.

In 2003, Philippe Oechslin, a researcher at the Swiss Federal Institute of Technology, presented a technique at the Crypto 2003 conference that made precomputation attacks far more practical.[2] Building on a time-memory trade-off first described by Martin Hellman in 1980,[10] Oechslin introduced a refinement he called rainbow tables. Like Hellman's original scheme, the technique avoided storing every possible hash by chaining hashes and reduction functions together and keeping only the endpoints. Oechslin's key insight was to use a different reduction function at each step in the chain, which prevented chains from merging and made the tables much more efficient.

Using 1.4 GB of precomputed data, Oechslin demonstrated that 99.9% of alphanumeric Windows LAN Manager password hashes could be recovered in 13.6 seconds. This worked because LAN Manager hashes had no salt: every system that hashed the same password produced the same output. One rainbow table worked everywhere.

Salting rendered rainbow tables ineffective, since each salt creates what is effectively a different hash function, and a separate table would be needed for every possible salt value. But Oechslin's work showed what happened when systems omitted it. Plenty of systems still did.

Formalizing the Fix

While rainbow tables dramatized the consequences of poor password handling, the formal solution had actually been published three years earlier.

In September 2000, Burt Kaliski at RSA Laboratories published PKCS #5 version 2.0, later released as RFC 2898.[3] The specification defined two key derivation functions: PBKDF1 and PBKDF2. PBKDF1 was included for backward compatibility, but it had a hard limitation: it could only produce derived keys up to 160 bits long, which was already insufficient for newer encryption algorithms. PBKDF2 removed that ceiling.

Kaliski's spec took the principles Morris and Thompson had applied informally and turned them into a general-purpose construction with clearly defined inputs: a password, a salt (recommended minimum of 64 bits), an iteration count, a pseudorandom function (HMAC-SHA1 by default), and a desired output key length.

The iteration count was the critical design decision. Rather than fixing the number of rounds (as Unix's DES-based scheme had done with 25), PBKDF2 made it a parameter that applications could increase over time. The specification recommended a minimum of 1,000 iterations as a starting point in the year 2000, with the explicit expectation that this number would grow as processors got faster.[3]
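
Those five inputs map directly onto modern implementations. For example, Python's standard-library hashlib.pbkdf2_hmac takes exactly the parameters the spec defines (shown here with HMAC-SHA256 rather than the spec's HMAC-SHA1 default; the password and iteration count are illustrative):

```python
import hashlib
import os

salt = os.urandom(16)          # random salt, well above the 64-bit recommended minimum
key = hashlib.pbkdf2_hmac(
    "sha256",                  # pseudorandom function: HMAC-SHA256
    b"Melanie2019!",           # password
    salt,                      # salt
    600_000,                   # iteration count (the 2023 OWASP figure for SHA-256)
    dklen=32,                  # desired key length: 256 bits
)
assert len(key) == 32
```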

How It Works

PBKDF2's mechanism is straightforward. Given a password P, a salt S, an iteration count c, and a desired key length, it works as follows:

It computes HMAC(P, S || 1), where the password is the HMAC key and the salt is concatenated with a block counter as the input. Call this result U₁. Then it computes HMAC(P, U₁) to get U₂, then HMAC(P, U₂) to get U₃, and so on for c iterations. The final output for that block is the XOR of all c intermediate values: U₁ XOR U₂ XOR U₃ XOR ... XOR U_c. If more key material is needed than a single HMAC output provides, the process repeats with an incremented block counter.

Each iteration depends on the result of the previous one, so the inner loop cannot be parallelized: the full chain must be computed in sequence. The intermediate results are XORed together to produce the final output, ensuring it reflects the entire history of the computation rather than just the last step. An attacker trying a million passwords has to run the full chain of iterations for each one.
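
The description above translates into a direct (and deliberately unoptimized) implementation. Checking it against Python's built-in hashlib.pbkdf2_hmac confirms the construction:

```python
import hashlib
import hmac
import struct

def pbkdf2_sha256(password: bytes, salt: bytes, iterations: int, dklen: int) -> bytes:
    """Textbook PBKDF2-HMAC-SHA256: one block per 32 bytes of output."""
    blocks = []
    for block_index in range(1, -(-dklen // 32) + 1):   # ceil(dklen / 32) blocks
        # U1 = HMAC(P, S || INT(block_index)), counter as 4-byte big-endian
        u = hmac.new(password, salt + struct.pack(">I", block_index),
                     hashlib.sha256).digest()
        t = u
        for _ in range(iterations - 1):
            # U_{i+1} = HMAC(P, U_i); XOR every U_i into the block output
            u = hmac.new(password, u, hashlib.sha256).digest()
            t = bytes(a ^ b for a, b in zip(t, u))
        blocks.append(t)
    return b"".join(blocks)[:dklen]

# Must agree with the standard library's implementation.
assert pbkdf2_sha256(b"hunter2", b"salt", 1000, 32) == \
       hashlib.pbkdf2_hmac("sha256", b"hunter2", b"salt", 1000, dklen=32)
```

Note that each block's chain is independent, so an attacker can parallelize across blocks and across password guesses; it is only the per-block iteration chain that is sequential.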

The result is what cryptographers call key stretching:[8] taking a low-entropy input (a human-chosen password) and making it computationally expensive to reverse, without changing the effort required for a legitimate user who only needs to derive the key once.

The Arms Race

The iteration count was designed to be a moving target, and it has moved considerably.

When PKCS #5 v2.0 was published in 2000, the recommended minimum was 1,000 iterations. By 2005, a Kerberos standard called for 4,096.[4] Apple reportedly used 2,000 iterations on iOS 3, then raised it to 10,000 for iOS 4. LastPass, in 2011, used 5,000 iterations for its JavaScript client and 100,000 for server-side hashing.[4]

By 2023, OWASP's recommendation had reached 600,000 iterations for PBKDF2-HMAC-SHA256 and 210,000 for PBKDF2-HMAC-SHA512.[4]

This escalation is exactly what the specification intended. PBKDF2 was not designed to be permanently configured; it was designed to be recalibrated as hardware improved. A system that still uses 1,000 iterations in 2026 is not using PBKDF2 as designed.
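
Recalibration can be done empirically: time a derivation on the target hardware and raise the count until it takes as long as you are willing to make a user wait. A rough sketch (the 250 ms budget and the doubling strategy are arbitrary choices for illustration):

```python
import hashlib
import time

def calibrate_iterations(target_seconds: float = 0.25, start: int = 100_000) -> int:
    # Double the iteration count until one derivation exceeds the time budget.
    iterations = start
    while True:
        t0 = time.perf_counter()
        hashlib.pbkdf2_hmac("sha256", b"benchmark", b"fixed-salt", iterations)
        if time.perf_counter() - t0 >= target_seconds:
            return iterations
        iterations *= 2
```

The resulting count should be stored alongside the salt and hash, so existing derivations remain verifiable after the parameter is raised for new ones.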

What PBKDF2 Doesn't Do

PBKDF2 is CPU-intensive but not memory-intensive. It can run with very little RAM, which means an attacker with GPUs or custom hardware (ASICs) can run many PBKDF2 computations in parallel cheaply. For applications like login systems, where the goal is to verify a password, newer functions like bcrypt,[5] scrypt,[6] and Argon2[7] address this by requiring significant memory per computation, making parallel attacks much harder.
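
The contrast is visible directly in Python's standard library, which exposes scrypt alongside pbkdf2_hmac on builds linked against OpenSSL 1.1+. PBKDF2's only cost knob is the iteration count; scrypt's parameters include an explicit memory cost (the parameter values below are common illustrative choices, not recommendations):

```python
import hashlib

# PBKDF2: cost is purely sequential CPU work, tuned by the iteration count.
k1 = hashlib.pbkdf2_hmac("sha256", b"hunter2", b"salt", 600_000, dklen=32)

# scrypt: n and r set a memory cost (~n * r * 128 bytes of RAM per computation,
# about 16 MB here), which is what makes massively parallel attacks expensive.
k2 = hashlib.scrypt(b"hunter2", salt=b"salt", n=2**14, r=8, p=1, dklen=32)

assert len(k1) == len(k2) == 32
```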

In practice, PBKDF2 is one layer in a broader security stack. An attacker would first need to get past the protections on the user's device (iOS sandboxing, device encryption, biometric authentication) or compromise their cloud storage account, which has its own defenses: two-factor authentication, unique passwords, passkeys. Only then would they face the cost of brute-forcing PBKDF2 itself.

Still Here, Still Everywhere

Despite being over twenty-five years old, PBKDF2 is still embedded across a wide range of systems.

RFC 8018 (PKCS #5 v2.1), published in 2017, continues to recommend it.[9] It is the key derivation function used in WPA2 Wi-Fi security. It is used in disk encryption implementations, password managers, and encrypted file formats. Apple's iOS uses it for keychain derivation. NIST SP 800-132 specifically covers its use for key derivation from passwords.[8]

PBKDF2 in Media Den

When you set a password in Media Den, the app never uses it directly as an encryption key. Instead, for each file you encrypt, a fresh random salt is generated and combined with your password through PBKDF2-HMAC-SHA256. The output is a 256-bit key unique to that file, which is then used with AES-256-GCM to encrypt the data.

Because each file has its own salt, each file has its own derived key, even though they all come from the same password. If an attacker somehow recovered one derived key, it would tell them nothing about the key protecting any other file, and nothing about the password itself.
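
The per-file pattern looks roughly like the sketch below. This is not Media Den's actual code; the 600,000 iteration count and 16-byte salt are assumptions following current OWASP and RFC guidance, and the AES-256-GCM encryption step is omitted.

```python
import hashlib
import os

def derive_file_key(password: str, salt: bytes) -> bytes:
    # One PBKDF2-HMAC-SHA256 derivation per file; the iteration
    # count here is an assumption, not Media Den's actual parameter.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000, dklen=32)

password = "Melanie2019!"
salt_a, salt_b = os.urandom(16), os.urandom(16)   # fresh random salt per file
key_a = derive_file_key(password, salt_a)
key_b = derive_file_key(password, salt_b)

# Same password, different salts: two unrelated 256-bit keys.
assert key_a != key_b and len(key_a) == 32
```

Each salt is stored in the clear alongside its file's ciphertext, so the key can be re-derived at decryption time from the password alone.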

Your password is the starting point. PBKDF2 is the bridge. AES-256-GCM is the lock. None of them work alone.

Learn more about Media Den →

Download on the App Store

References

  1. R. Morris and K. Thompson, "Password Security: A Case History," Communications of the ACM, vol. 22, no. 11, pp. 594–597, November 1979. dl.acm.org
  2. P. Oechslin, "Making a Faster Cryptanalytic Time-Memory Trade-Off," Advances in Cryptology — CRYPTO 2003, Lecture Notes in Computer Science, vol. 2729, pp. 617–630, 2003. iacr.org
  3. B. Kaliski, "PKCS #5: Password-Based Cryptography Specification Version 2.0," RFC 2898, RSA Laboratories, September 2000. rfc-editor.org
  4. Wikipedia, "PBKDF2." Covers iteration count recommendations across implementations and OWASP guidelines. en.wikipedia.org
  5. N. Provos and D. Mazières, "A Future-Adaptable Password Scheme," Proceedings of the USENIX Annual Technical Conference, 1999. usenix.org
  6. C. Percival, "Stronger Key Derivation via Sequential Memory-Hard Functions," presented at BSDCan, 2009. tarsnap.com
  7. A. Biryukov, D. Dinu, and D. Khovratovich, "Argon2: the memory-hard function for password hashing and other applications," winner of the Password Hashing Competition, 2015. password-hashing.net
  8. NIST, "Special Publication 800-132: Recommendation for Password-Based Key Derivation," December 2010. nist.gov
  9. K. Moriarty, B. Kaliski, and A. Rusch, "PKCS #5: Password-Based Cryptography Specification Version 2.1," RFC 8018, January 2017. rfc-editor.org
  10. M. E. Hellman, "A Cryptanalytic Time-Memory Trade-Off," IEEE Transactions on Information Theory, vol. 26, no. 4, pp. 401–406, July 1980. ieeexplore.ieee.org