This page covers two decisions that shape every Ditto data model:
  1. How to model relationships — when to keep related data together in one document, and when to split it across collections.
  2. How to manage document size — what counts as “too big,” and what to do about it.
For deeper guidance on how Ditto’s CRDT merge rules interact with these choices, see Conflict Resolution Patterns.

See Also

Conflict Resolution Patterns

Deeper guidance on denormalized documents, audit logs, maps vs arrays, and read-time derivation.

Document Model

How Ditto documents work internally as CRDTs, and the document size limits referenced below.

Device Storage Management

Time-to-Live (TTL) eviction strategies to manage storage in long-running deployments.

Modeling Relationships

The default in Ditto is to denormalize related data into a single document, with sub-entities stored as maps keyed by ID. A single write is atomic, the full document syncs as a unit, and Ditto’s add-wins maps merge concurrent edits to different sub-entities cleanly. See Denormalized Documents for the full pattern.

Embedding sub-entities as maps

Consider a people collection where each person owns a set of cars. The recommended pattern is to store the cars inline as a map keyed by car ID:
JSON
{
    "_id": "abc123",
    "name": "Susan",
    "age": 31,
    "cars": {
        "def456": {
            "make": "Hyundai",
            "color": "red",
            "mileage": 13000
        },
        "ghi789": {
            "make": "Jeep",
            "color": "blue",
            "mileage": 34000
        }
    }
}
Two devices can update different cars — or different fields on the same car — and Ditto merges both writes automatically. There is no special performance penalty for this nesting; CRDT maps work the same way at every level of the document.
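The effect of keying sub-entities by ID can be illustrated with plain Python dictionaries. This is a minimal sketch and not Ditto's merge implementation: `apply_patch` and the two `device_*` patches are hypothetical names invented for the illustration. It shows only that edits addressed to different paths in the map do not overlap, so applying them in either order gives the same document.

```python
import copy


def apply_patch(doc, patch):
    """Return a copy of doc with each (path -> value) entry written at its nested path."""
    result = copy.deepcopy(doc)
    for path, value in patch.items():
        node = result
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return result


susan = {
    "_id": "abc123",
    "name": "Susan",
    "age": 31,
    "cars": {
        "def456": {"make": "Hyundai", "color": "red", "mileage": 13000},
        "ghi789": {"make": "Jeep", "color": "blue", "mileage": 34000},
    },
}

# Hypothetical concurrent edits from two devices, written as path -> new value.
device_a = {("cars", "def456", "mileage"): 13250}
device_b = {("cars", "ghi789", "color"): "green"}

# Because the paths are disjoint, the order of application does not matter.
a_then_b = apply_patch(apply_patch(susan, device_a), device_b)
b_then_a = apply_patch(apply_patch(susan, device_b), device_a)
```

Had the cars been stored as an array, both edits would have had to address elements by position rather than by a stable ID, which is the situation the map-keyed layout avoids.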

When to split into a separate collection

Pull a sub-entity into its own collection when one of the following clearly applies:
  • Independent permission scopes. The sub-entity is owned or edited by a principal who shouldn’t have read or write access to the rest of the parent. For example, a mechanic who works on cars but should not see the owner’s other personal data.
  • Independent access at scale. The application frequently reads or writes the sub-entity without ever touching the parent. For example, a fleet view that scans cars across many owners.
  • Document size pressure. Continuing to embed the sub-entity would push the parent past the document size limits.
In a split model, each sub-entity lives in its own document with a foreign-key reference back to the parent:
JSON
{
    "_id": "def456",
    "ownerId": "abc123",
    "make": "Hyundai",
    "color": "red"
}
Splitting trades atomic single-document sync and simpler reads for the access flexibility of a separate collection. Use the embedded form unless one of the criteria above clearly applies.
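The relationship between the two shapes can be sketched as a small transformation. The helpers below (`split_cars`, `cars_for_owner`) are hypothetical and operate on plain Python dictionaries rather than on the Ditto SDK. They show how each embedded car becomes its own document carrying an `ownerId`, and what a read of one owner's cars then looks like.

```python
def split_cars(person):
    """Turn one embedded person document into a parent document plus one document per car."""
    parent = {key: value for key, value in person.items() if key != "cars"}
    car_docs = [
        # The map key becomes the car's _id; ownerId points back at the parent.
        {"_id": car_id, "ownerId": person["_id"], **fields}
        for car_id, fields in person.get("cars", {}).items()
    ]
    return parent, car_docs


def cars_for_owner(car_docs, owner_id):
    """Reading one owner's cars now needs a filter on the foreign key."""
    return [car for car in car_docs if car["ownerId"] == owner_id]


person = {
    "_id": "abc123",
    "name": "Susan",
    "age": 31,
    "cars": {
        "def456": {"make": "Hyundai", "color": "red", "mileage": 13000},
        "ghi789": {"make": "Jeep", "color": "blue", "mileage": 34000},
    },
}

parent, car_docs = split_cars(person)
susans_cars = cars_for_owner(car_docs, "abc123")
```

The extra filter step in `cars_for_owner` is the "simpler reads" cost mentioned above: in the embedded form the same data arrives with the parent document.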

Document Size

Document size doesn’t drive steady-state sync — Ditto syncs only the fields that change between peers — but it does affect local storage, memory, serialization time, and the initial replication of new documents. Ditto enforces a soft warning at 250 KB and a hard sync cap at 5 MB; see Document Size for the full mechanism. If denormalizing a sub-entity would push the parent toward these limits, that is one of the criteria for splitting it into a separate collection. For large binary blobs such as images or video, use the ATTACHMENT data type rather than embedding raw bytes.
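A rough pre-flight check against these thresholds might look like the sketch below. It rests on several assumptions: the size of the compact JSON encoding is only a proxy for what Ditto actually stores (the internal CRDT representation will differ), the constants treat 1 KB as 1024 bytes, and `estimate_size_bytes` and `size_status` are hypothetical helpers, not part of any Ditto SDK.

```python
import json

# Assumption: 1 KB = 1024 bytes. The source states only "250 KB" and "5 MB".
SOFT_WARNING_BYTES = 250 * 1024
HARD_SYNC_CAP_BYTES = 5 * 1024 * 1024


def estimate_size_bytes(doc):
    """Approximate a document's size as the UTF-8 length of its compact JSON encoding."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


def size_status(doc):
    """Classify a document against the soft warning and hard cap thresholds."""
    size = estimate_size_bytes(doc)
    if size >= HARD_SYNC_CAP_BYTES:
        return "over_hard_cap"
    if size >= SOFT_WARNING_BYTES:
        return "over_soft_warning"
    return "ok"


small_doc = {"_id": "abc123", "name": "Susan"}
large_doc = {"_id": "abc123", "notes": "x" * 300_000}
```

A check like this is useful mainly as an early signal during modeling. A document that lands near the soft threshold is a candidate for splitting a sub-entity out or moving binary data to an ATTACHMENT.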