Technical · 9 min

Changing a Password Is a One-Liner — Unless You Encrypt Everything

Crash-safe master-password rotation in a zero-knowledge journaling app: re-encrypting across two layers and two runtimes with a single atomic flip a kill -9 can't corrupt.


The naïve expectation

Settings → Change Password. There’s a row for it in every app. How hard can it be — overwrite the stored hash, done?

That instinct is exactly wrong for MoodHaven, and the reason it’s wrong is the whole post.

The reveal: there’s nothing to “update”

MoodHaven is zero-knowledge. The password is never stored — not the password, not a reversible form of it. It’s run through PBKDF2-HMAC-SHA256 (600k iterations) to derive the keys that encrypt your data, and those keys live only in memory while the app is unlocked. “We can’t read your journal” isn’t marketing; it’s that there is no stored key to read.

Which means there is no field to overwrite. The password is the key. Changing it means re-deriving every key that hangs off it and re-encrypting everything those keys protect. And in MoodHaven that’s two independent layers:

  • Outer — SQLCipher whole-DB. The entire database file is encrypted with a raw 256-bit key = PBKDF2(password, db_salt, 600k), applied as a raw PRAGMA key = "x'<hex>'". The salt lives in db_state.json.
  • Inner — per-field AES-256-GCM. Journal text, signals, the TOTP seed, and media each get their own AES-256-GCM blob with its own random salt. These keys are derived in the frontend (WebCrypto), because that’s where plaintext is allowed to exist.

So the work spans two runtimes. JS holds the entry and signal keys; Rust holds everything else (the SQLCipher key, media files on disk, the TOTP seed). Neither side can do the job alone.
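As a small illustration of the outer layer, here is a minimal sketch of how an already-derived 256-bit key could be turned into SQLCipher's raw-key PRAGMA form. The function name `raw_key_pragma` is hypothetical, and the PBKDF2 derivation itself is assumed to have happened elsewhere; this only shows the hex formatting step.

```rust
/// Formats an already-derived 256-bit key as a SQLCipher raw-key PRAGMA.
///
/// Assumption: `key` is the 32-byte output of PBKDF2(password, db_salt, 600k).
/// The derivation itself is not shown here.
pub fn raw_key_pragma(key: &[u8; 32]) -> String {
    // 32 bytes become 64 lowercase hex characters.
    let hex: String = key.iter().map(|b| format!("{:02x}", b)).collect();
    // The x'...' blob literal tells SQLCipher to use the bytes as the key
    // directly instead of running its own passphrase derivation.
    format!("PRAGMA key = \"x'{}'\";", hex)
}
```

The resulting string contains the key, so in real code it should never be logged or persisted.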

Here’s the full dependency audit — eleven things that touch the password:

#   Item               Stored as                                 Action on change
1   Journal entries    per-field AES-GCM blob                    re-encrypt (frontend has the key)
2   Signals            per-field AES-GCM blob                    re-encrypt (frontend)
3   Media attachments  encrypted files on disk                   re-encrypt (Rust)
4   SQLCipher DB key   password-derived, salt on disk            rekey the whole file (Rust)
5   TOTP seed          per-field blob                            re-encrypt (Rust)
6   PIN unlock         wrapped copy of the password              invalidate + prompt
7   Desktop biometric  password in OS keyring                    invalidate + prompt
8   Recovery key       password wrapped under the recovery code  re-escrow or invalidate
9   Verifier hash      PBKDF2 hash + salt                        update
10  Hardware key       independent secret (not the password)     no-op
11  Old export files   self-contained envelope per export        no-op

Items 1–5 plus 9 are the irreversible re-encryption work. Items 6–8 are convenience copies that wrap the old password; the safe default is to invalidate them and prompt re-setup (more on the one exception below). Items 10–11 are genuinely independent.

The real enemy: atomicity

The hard part isn’t the crypto. AES-GCM and PBKDF2 are library calls. The hard part is that this operation is three different kinds of write with three different failure models:

  • The inner re-encryption is a SQLite transaction (atomic, fine on its own).
  • The media is loose files on a filesystem (not transactional, not atomic).
  • The outer rekey rewrites the whole DB file.

Computers lose power. Processes get SIGKILL’d. If a change is interrupted mid-flight, the unacceptable outcome is a journal that’s partly on the new password and partly on the old — a state where no single password opens all the data. In a zero-knowledge app that’s not “an error message,” that’s permanent data loss. There’s no admin key to recover with; that’s the entire premise.

So the bar is: at every instant, the data must be openable by exactly one password — old XOR new — and an interruption must always resolve to one of those two, never a mix.

The design fork

The plan sketched the obvious order: re-encrypt the inner blobs and commit them → re-encrypt media → rekey the outer SQLCipher layer. Implementing it surfaced why that order is unsafe.

The outer and inner keys both derive from the same password. So if I commit the inner blobs on the new password while the DB file is still SQLCipher-keyed with the old password, and the process dies right there, I’ve created a persistent skew: outer = old, inner = new. No single password opens it.

And I couldn’t “resume forward” out of it, either. The crash-recovery marker deliberately persists no key material (writing the password to disk to resume would defeat the zero-knowledge model). Worse, startup recovery runs before the user unlocks — there’s no password available at recovery time at all. So recovery has to be able to finish with no key.

That constraint is what produced the real design.

Single atomic flip, keyless tail. Do every operation that needs a key first, before a single commit point. Then make everything after the commit keyless, so startup recovery can finish it forward with no password.

Concretely:

  1. Stage media — for each attachment, write its new-password copy to a <file>.rekeytmp sibling. Originals untouched. (Keyed; reversible.)
  2. Build a new-keyed tmp DB: moodhaven_rekey.db, a SQLCipher database opened under the new key, with the inner blobs, TOTP, verifier, and recovery escrow all re-encrypted inside it. The live DB is still untouched. (Keyed; reversible, since it’s a file off to the side.)
  3. The commit — one atomic flip: write the new salt into db_state.json and atomically rename the tmp over the live DB (Database::rekey_in_place). This is the only irreversible instant.
  4. Keyless tail — rename the staged .rekeytmp media over their originals; clear the marker. No key required for any of it.

A crash before step 3 leaves the live DB wholly on the old password; recovery discards the orphan tmp and staged media. A crash after step 3 has already committed the new password; recovery just finishes the keyless renames.
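The keyless half of this can be sketched with nothing but filesystem calls. The following is a minimal illustration, not MoodHaven's actual code: the function names are hypothetical, and it assumes all staged files sit directly in one directory and are discovered by scanning for the `.rekeytmp` suffix (the real marker tracks per-file progress instead).

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Lists every staged `<file>.rekeytmp` sibling directly inside `dir`.
fn staged_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut staged = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.extension().map_or(false, |ext| ext == "rekeytmp") {
            staged.push(path);
        }
    }
    Ok(staged)
}

/// Roll forward: rename each staged copy over its original.
/// Needs no key, and is safe to re-run after a crash because files that
/// were already promoted no longer have a `.rekeytmp` sibling.
pub fn promote_staged(dir: &Path) -> io::Result<usize> {
    let staged = staged_files(dir)?;
    for tmp in &staged {
        // "photo.bin.rekeytmp" -> "photo.bin"
        let original = tmp.with_extension("");
        fs::rename(tmp, &original)?;
    }
    Ok(staged.len())
}

/// Roll back: delete the staged copies and leave the originals untouched.
pub fn discard_staged(dir: &Path) -> io::Result<usize> {
    let staged = staged_files(dir)?;
    for tmp in &staged {
        fs::remove_file(tmp)?;
    }
    Ok(staged.len())
}
```

Both functions return how many staged files they handled, so a second call after a completed run returns 0. A production version would likely also sync the directory after the renames so they survive power loss.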

And recovery’s entire decision collapses to one comparison:

db_state.json.salt == marker.new_salt_b64
   ⇒ committed   → roll forward (finish keyless tail)
   ≠             → pre-commit  → roll back (discard tmp + staged media)

The salt is the commit record. The marker carries the new salt and per-file media progress — never key material. That’s the whole recovery protocol.
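In code, that comparison is small enough to be a pure function. A minimal sketch, assuming the marker's absence means no change was in flight; the enum and function names are hypothetical, not the app's real API:

```rust
/// What startup recovery should do. No variant needs a key.
#[derive(Debug, PartialEq, Eq)]
pub enum Recovery {
    /// No marker on disk: no password change was in flight.
    NothingPending,
    /// Salt already flipped: finish the keyless tail.
    RollForward,
    /// Salt not flipped: discard the tmp DB and staged media.
    RollBack,
}

/// `state_salt_b64` is the salt currently in db_state.json.
/// `marker_new_salt_b64` is the new salt recorded in the marker, if any.
pub fn decide(state_salt_b64: &str, marker_new_salt_b64: Option<&str>) -> Recovery {
    match marker_new_salt_b64 {
        None => Recovery::NothingPending,
        Some(new_salt) if new_salt == state_salt_b64 => Recovery::RollForward,
        Some(_) => Recovery::RollBack,
    }
}
```

Because the inputs are two strings that are both safe to store, the function can run before unlock, which is the property the design depends on.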

The proof

A design that “should be” crash-safe is a hypothesis. I wanted it falsified or confirmed by an actual killed process, not by reasoning.

The harness drops crash_point! markers at every phase boundary. Two layers of proof run in CI:

  1. In-process state-injection matrix (cmp_b0..b4) — exercises each boundary deterministically; runs on every OS in CI.
  2. Literal kill -9 against a real subprocess — an external signal lands at a chosen boundary, the process dies for real, then a fresh process runs recovery and asserts the invariant.

The invariant, asserted at every boundary: the data is openable by old XOR new (never both, never neither), and a sentinel row survives intact.
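Spelled out as a predicate, the invariant might look like the sketch below. The `Probe` struct and its field names are assumptions for illustration; the real harness gathers these facts by actually trying to open the recovered data with each password.

```rust
/// Facts observed after a fresh process has run recovery.
#[derive(Debug, Clone, Copy)]
pub struct Probe {
    pub opens_with_old: bool,
    pub opens_with_new: bool,
    pub sentinel_intact: bool,
}

/// True only when exactly one password opens the data (old XOR new)
/// and the sentinel row survived.
pub fn invariant_holds(p: Probe) -> bool {
    (p.opens_with_old ^ p.opens_with_new) && p.sentinel_intact
}
```

The XOR rejects both failure shapes at once: data that opens under neither password, and data that somehow still opens under both.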

The exhibit:

boundary                   killed   recovered   sentinel result
encrypt.after_salt         SIGKILL  old         intact   PASS
encrypt.after_export       SIGKILL  new         intact   PASS
encrypt.after_state_true   SIGKILL  new         intact   PASS
encrypt.before_rename      SIGKILL  new         intact   PASS
encrypt.after_rename       SIGKILL  new         intact   PASS
cmp.tmp_built              SIGKILL  old         intact   PASS
cmp.after_db_flip          SIGKILL  new         intact   PASS
cmp.after_promote          SIGKILL  new         intact   PASS
→ 8/8 boundaries: data survived a kill -9 at every step

Note where “old” flips to “new”: precisely at the salt/rename commit, exactly as designed. The boundaries before it recover to the old password; the boundaries at and after it recover to the new one. No boundary recovers to a mix, because the design makes a mix unrepresentable.

Two details that make the “I actually built this” point

A design diagram doesn’t catch these. Building it does.

Sealed time-capsule entries are invisible to the normal read API. MoodHaven lets you seal an entry until a future date, and the ordinary “get entries” path withholds sealed entries by design — that’s the feature. Which means a re-key sweep built on the normal read API would skip every sealed entry, leave it encrypted under the old password, and render it permanently undecryptable after the change. I found this while wiring the frontend fetch and added a dedicated raw-blob path (get_entry_rekey_blobs) that returns all entries, sealed included, specifically for re-keying. The danger isn’t in what you re-encrypt — it’s in what you silently don’t.
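The gap is easy to reproduce in miniature. This is a simplified sketch with a hypothetical `Entry` type and function names; it only models the difference between a read path that honors the seal and a re-key path that must not.

```rust
/// A stored entry; `sealed_until` is a Unix timestamp when the seal lifts.
pub struct Entry {
    pub id: u32,
    pub sealed_until: Option<u64>,
    pub blob: Vec<u8>,
}

/// The normal read path: withholds entries that are still sealed at `now`.
pub fn visible_entries(all: &[Entry], now: u64) -> Vec<&Entry> {
    all.iter()
        .filter(|e| e.sealed_until.map_or(true, |t| t <= now))
        .collect()
}

/// The re-key path: returns every entry, sealed or not.
pub fn rekey_blobs(all: &[Entry]) -> Vec<&Entry> {
    all.iter().collect()
}

/// IDs a sweep built on the normal read path would silently skip.
pub fn missed_by_normal_read(all: &[Entry], now: u64) -> Vec<u32> {
    all.iter()
        .filter(|e| e.sealed_until.map_or(false, |t| t > now))
        .map(|e| e.id)
        .collect()
}
```

A cheap guard is to assert, before committing, that the number of re-encrypted blobs equals the total row count rather than the count the read API reports.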

A signals serialization bug in my own scaffold. Signals are stored as the full JSON.stringify(EncryptedData) envelope — { iv, data, salt }. My first cut of the change pipeline sent only the ciphertext field across the IPC boundary, dropping the per-blob iv and salt. Every signal would have been re-stored without the values needed to decrypt it — silent, total loss of the signals table. It was caught not by review but by a round-trip test: encrypt under old → re-key → assert it decrypts under new and fails under old. The test failed because there was nothing to decrypt. Write the round-trip test first; it finds the bugs your eyes approve.

The convenience-factor decision, and one exception

PIN unlock and biometric unlock store a wrapped copy of the password. After a change they wrap the wrong password. I deliberately invalidate them rather than re-wrap in-band: re-wrapping requires their secrets (the PIN itself, an OS-keyring write) inside the same atomic window, which widens the failure surface for marginal UX gain on a rare action. The post-change checklist makes re-setup a ten-second task.

The recovery key earned an exception. It wraps the password under the user’s 24-character recovery code, and if the user re-enters that code during the change, the frontend can re-wrap the new password under it (a pure transform, no settings write) and hand the blob to the backend, which installs it inside the atomic flip. So the same recovery key keeps working. If the field is left blank, the stale key is disabled and the checklist prompts regeneration. (This shipped as a fast-follow after the core feature.)

The honest coda

The “right” version of this feature doesn’t re-encrypt anything. You introduce a random master data key (MDK) that encrypts all the fields, and you store the MDK wrapped by the password-derived key — exactly the shape PIN, biometric, and recovery already use for the password. Then changing the password is a single re-wrap of one small blob: O(1), instant, trivially atomic. PIN/biometric/recovery would wrap the MDK instead of the password, so they’d survive a change untouched, too.

I shipped the honest O(data) re-encrypt first anyway, for two reasons. One, the MDK is a crypto-core change: current blobs are keyed directly by the password, so adopting it still requires a one-time full re-encryption migration, with its own versioning and crash-safety. Two, and this is the part I like, that one-time migration is itself a run of the re-encrypt pipeline described here. The batching, the pending marker, the staged-media swap, the single atomic flip: when the MDK lands, this machinery is reused, not thrown away. The honest version isn’t a detour around the elegant version. It’s the elegant version’s migration path, built and crash-tested a release early.


Shipped in PR #155. Verified: cargo test --lib 220, crash-replay 8/8, vitest 1516, clippy/fmt/typecheck/lint clean. There’s a plain-language companion writeup on the MoodHaven site for the user-trust version of this story.