Chapter 2 gave us a complete data model — immutable, content-addressed values with O(1) metadata and clean APIs. But those values live in a cache (a plain Clojure map in an atom). What happens when you need persistence, distribution, or lazy loading across machines?
This chapter adds stores — the persistence layer. A store is a content-addressed key-value system where keys are hashes and values are serialized nodes. Stores compose hierarchically: memory → disk → peers → origin server. Reads walk layers top-down; writes propagate everywhere.
Every store implements a minimal protocol:

```
get(hash)            → Value | nil
put(hash, value)     → Store         // idempotent
has?(hash)           → bool
snapshot()           → {hash: value}
merge({hash: value}) → Store         // bulk insert
reset()              → Store         // clear all
```
put is idempotent — storing the same content twice is a no-op. Values are serialized bytes; the hash is computed from the logical content, not the serialized form (Chapter 2).
This protocol is language-agnostic. Clojure, Rust, TypeScript — all implement the same six functions.
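Since the protocol is language-agnostic, a minimal in-memory implementation can be sketched in Python. This is an illustrative sketch, not Dacite's actual API; the class and helper names are hypothetical:

```python
import hashlib


class MemStore:
    """Minimal content-addressed store: the six protocol functions over a dict."""

    def __init__(self, entries=None):
        self._entries = dict(entries or {})

    def get(self, h):
        """hash -> value | None"""
        return self._entries.get(h)

    def put(self, h, value):
        """Idempotent: storing content under an existing hash is a no-op."""
        self._entries.setdefault(h, value)
        return self

    def has(self, h):
        return h in self._entries

    def snapshot(self):
        return dict(self._entries)

    def merge(self, entries):
        for h, v in entries.items():
            self.put(h, v)
        return self

    def reset(self):
        self._entries.clear()
        return self


def content_hash(data: bytes) -> str:
    # Illustrative only: Dacite hashes the logical content, not the
    # serialized form (Chapter 2). SHA-256 stands in for the real scheme.
    return hashlib.sha256(data).hexdigest()
```

Because keys are content hashes, `put` never needs to overwrite: the same hash always names the same content, so `setdefault` is enough to make it idempotent.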
An in-memory atom over a map: {hash → serialized-value}. Fast reads/writes, ephemeral. Default for testing and construction.
A MemStore’s internal atom is shared directly with the Layer 2 cache (§3.3), so value constructors and store operations see the same data with zero synchronization overhead.
Content-addressed filesystem with directory sharding:
base/ab/cd/abcdef....edn
Each hash maps to a two-level directory structure. Values stored as EDN. Durable, but slower than memory.
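The two-level sharding can be sketched as a pure path computation, matching the `base/ab/cd/` layout above (function name hypothetical):

```python
from pathlib import Path


def node_path(base: str, h: str) -> Path:
    """Map a hex hash to base/ab/cd/<full-hash>.edn, sharding on the first
    two hex-digit pairs to keep per-directory fan-out bounded (256 entries)."""
    return Path(base) / h[:2] / h[2:4] / f"{h}.edn"
```

For example, `node_path("base", "abcdef01")` yields `base/ab/cd/abcdef01.edn`.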
LMDB-backed persistent store with optional meta database:
- hash → serialized-value (32-byte keys, EDN values)
- string-key → value (for root hashes, metadata)

Supports configurable max size and database names. Requires an explicit lmdb-close when done.
Composes stores with read-through semantics:
```clojure
(layered-store (mem-store) (lmdb-store "/tmp/dacite"))
```
A remote peer slots in naturally — local layers cache remote fetches automatically.
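The read-through behavior can be sketched as follows: a miss in an upper layer falls through, and a hit in a lower layer back-fills every layer above it, while writes propagate to all layers. This is a sketch assuming stores with the get/put protocol described above; `DictStore` is a stand-in, not Dacite's API:

```python
class DictStore:
    """Trivial dict-backed store, standing in for mem/disk/peer layers."""

    def __init__(self):
        self.d = {}

    def get(self, h):
        return self.d.get(h)

    def put(self, h, v):
        self.d.setdefault(h, v)  # idempotent
        return self


class LayeredStore:
    """Compose stores: reads walk layers top-down, writes go everywhere."""

    def __init__(self, *layers):
        self.layers = list(layers)  # fastest (memory) first

    def get(self, h):
        for i, layer in enumerate(self.layers):
            value = layer.get(h)
            if value is not None:
                # Back-fill faster layers so the next read hits earlier.
                for upper in self.layers[:i]:
                    upper.put(h, value)
                return value
        return None

    def put(self, h, value):
        for layer in self.layers:
            layer.put(h, value)  # writes propagate to every layer
        return self
```

The back-fill in `get` is what makes a remote peer slot in naturally: anything fetched from a peer layer is cached locally as a side effect of the read.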
Layer 2 values operate on a cache — a dynamic var *cache* holding an atom over a plain map. Layer 3 bridges stores to this cache so that value operations and store operations stay in sync.
The convenience functions in the store namespace write to both cache and store:
- get-store checks the cache first, falls through to the store on a miss, and populates the cache on a hit
- put-store! writes to both cache and store
- merge-store! merges into both

For MemStores, the bridge is zero-overhead: the store’s internal atom is the cache atom. No copying, no synchronization. For other store types (File, LMDB, Layered), a separate cache atom is created from a snapshot.
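The read path of get-store can be sketched as cache-first lookup with back-fill. This is a sketch of the described semantics, not the actual Clojure implementation:

```python
def get_store(cache: dict, store, h):
    """Check the cache first; on a miss, fall through to the store and
    populate the cache with whatever the store returns."""
    if h in cache:
        return cache[h]
    value = store.get(h)
    if value is not None:
        cache[h] = value  # back-fill so the next lookup is a cache hit
    return value
```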
Any code that rebinds *store* must also rebind *cache* to keep them in sync. Two macros handle this:
bind-store — binds both *store* and *cache* for the duration of body:
```clojure
(store/bind-store my-store
  (d/hash-map "key" "value"))
```
with-store — creates an isolated store context, returns [snapshot result]:
```clojure
(store/with-store [s (mem-store)]
  (d/hash-map "key" "value"))
;; => [{hash1 [...], hash2 [...], ...} <DaciteMap>]
```
Never use (binding [store/*store* ...]) directly — use bind-store or with-store instead.
Stores hold serialized values. Dacite defines two formats:
Authoritative for hashing/storage. Deterministic, compact, streaming.
```
node = kind-tag (1 byte) + fields

Scalar:            0x00 + u8(type-len) + type-bytes + u8(val-len) + val-bytes
Seq Node:          0x01 + u8(subtype) + measure (48 bytes) + u8(n-children) + hashes[n]
Map Node:          0x02 + u8(subtype) + type-specific fields
Collection Header: 0x03 + u8(type) + root-hash + u64(count) + u64(size_bytes)
```
Measures are 48 bytes: u64(count) + u64(size_bytes) + hash(32 bytes).
Nodes fit in ~1 KB. No unbounded structures.
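As an illustration of the layout, packing a scalar node and a measure can be sketched in Python. The helper names are hypothetical, and big-endian integers are an assumption here; the appendix is authoritative:

```python
import struct


def pack_scalar(type_name: bytes, value: bytes) -> bytes:
    """Scalar node: 0x00 + u8(type-len) + type-bytes + u8(val-len) + val-bytes."""
    assert len(type_name) < 256 and len(value) < 256
    return bytes([0x00, len(type_name)]) + type_name + bytes([len(value)]) + value


def pack_measure(count: int, size_bytes: int, h: bytes) -> bytes:
    """Measure: 48 bytes = u64(count) + u64(size_bytes) + 32-byte hash.
    Endianness assumed big-endian for this sketch."""
    assert len(h) == 32
    return struct.pack(">QQ", count, size_bytes) + h
```

The length-prefixed fields are what keep nodes small and streamable: every field's size is known before it is read, and no field can exceed its one-byte length prefix.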
See Appendix: Serialization for the complete binary format specification.
Round-trips through clj->dac / dac->clj and Cheshire. Preserves hashes through the value layer’s content-addressing.
Not yet implemented. This section describes the target design.
Immutable hashes enable perfect caching — no invalidation needed.
Server uses size_bytes to choose response mode:
```
GET /node/{hash}?inline_under=1024&leaf_chunk=4096
```
| Condition | Response |
|---|---|
| size_bytes ≤ inline_under | Inline scalars |
| uniform scalar leaves | Coalesced chunks |
| else | Structure (hashes only) |
Client controls thresholds. Blobs/strings fetch as single chunks.
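The server's decision can be sketched as a pure function of the node's measure and the client-supplied threshold. This is a sketch of the target design (the section above notes it is not yet implemented); the parameter names follow the query string:

```python
def response_mode(size_bytes: int, inline_under: int,
                  uniform_scalar_leaves: bool) -> str:
    """Choose how to answer GET /node/{hash} from the node's measure."""
    if size_bytes <= inline_under:
        return "inline"       # small enough: send scalar values inline
    if uniform_scalar_leaves:
        return "coalesced"    # e.g. blobs/strings: send chunked leaf data
    return "structure"        # otherwise: hashes only, client recurses
```

Because size_bytes lives in the node's measure, the server can decide without deserializing children; the client tunes inline_under per request.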
Stores layer as: local-mem → local-disk → peers → origin. Peers discover via root hashes. No central index — hashes are the index.
Not yet implemented. This section describes the target design.
Stores are caches at every layer. Evict freely — immutable data re-fetches identically.
Root pinning: Mark roots non-evictable (reachable nodes protected).
Purge: Delete root. Orphans evict naturally. Shared nodes survive.
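Purge can be sketched as reachability from the remaining pinned roots: delete a root, then evict anything no longer reachable, which automatically preserves nodes shared with surviving roots. A sketch of the target design, where `children` stands in for reading a node's child hashes:

```python
def reachable(roots, children):
    """All hashes reachable from the pinned roots."""
    seen, stack = set(), list(roots)
    while stack:
        h = stack.pop()
        if h not in seen:
            seen.add(h)
            stack.extend(children.get(h, ()))
    return seen


def purge(store: dict, roots: set, children: dict, root: str) -> None:
    """Unpin one root, then evict every node unreachable from the rest.
    Orphans disappear; nodes shared with surviving roots are kept."""
    roots.discard(root)
    keep = reachable(roots, children)
    for h in list(store):
        if h not in keep:
            del store[h]
```

Note there is no reference counting: immutability means any over-eager eviction is safe, since evicted nodes re-fetch identically.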
| Function | Signature | Description |
|---|---|---|
| get | hash → Value\|nil | Fetch serialized value |
| put | (hash, Value) → Store | Store (idempotent) |
| has? | hash → bool | Exists? |
| snapshot | → {hash: Value} | All entries |
| merge | {hash: Value} → Store | Bulk insert |
| reset | → Store | Clear |
| Function | Description |
|---|---|
| get-store | Cache-first lookup, falls through to store |
| put-store! | Write to both cache and store |
| merge-store! | Bulk write to both |
| snapshot-store | Cache snapshot |
| bind-store | Bind *store* + *cache* together |
| with-store | Isolated store context, returns [snapshot result] |
| Function | Description |
|---|---|
| mem-store | In-memory atom-backed store |
| file-store | Filesystem with directory sharding |
| lmdb-store | LMDB-backed persistent store |
| layered-store | Compose stores with read-through |
| Function | Description |
|---|---|
| serialize | Value → canonical bytes |
| deserialize | Bytes → value |
| json->dacite | JSON string → Dacite value |
| dacite->json | Dacite value → JSON string |
Laws:

- put(h, v); get(h) = v
- put(h, v); put(h, v) is idempotent

Depends on Layers 1–2. First layer with I/O and state.
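These laws can be checked mechanically against any conforming store; a minimal dict-backed sketch (`TinyStore` is a stand-in, not Dacite's API):

```python
class TinyStore:
    """Smallest conforming store, just enough to exercise the laws."""

    def __init__(self):
        self._m = {}

    def put(self, h, v):
        self._m.setdefault(h, v)  # idempotent insert
        return self

    def get(self, h):
        return self._m.get(h)

    def snapshot(self):
        return dict(self._m)


def check_store_laws(store):
    """put(h, v); get(h) = v, and a repeated put changes nothing."""
    store.put("h1", b"v1")
    assert store.get("h1") == b"v1"   # round-trip law
    before = store.snapshot()
    store.put("h1", b"v1")            # second put is a no-op
    assert store.snapshot() == before
```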
Chapter 4 adds authorization: proof of possession and authenticated access.