# Data Modeling Tips

> How to structure your documents for Ditto's peer-to-peer sync — balancing single-document atomicity against document size.

This page covers two decisions that shape every Ditto data model:

1. **How to model relationships** — when to keep related data together in one document, and when to split it across collections.
2. **How to manage document size** — what counts as "too big," and what to do about it.

For deeper guidance on how Ditto's CRDT merge rules interact with these choices, see [Conflict Resolution Patterns](/best-practices/conflict-resolution-patterns).

## See Also

<CardGroup>
  <Card title="Conflict Resolution Patterns" icon="code-merge" iconType="solid" href="/best-practices/conflict-resolution-patterns">
    Deeper guidance on denormalized documents, audit logs, maps vs arrays, and read-time derivation.
  </Card>

  <Card title="Document Model" icon="circle-nodes" iconType="solid" href="/key-concepts/document-model">
    How Ditto documents work internally as CRDTs, and the document size limits referenced below.
  </Card>

  <Card title="Device Storage Management" icon="hourglass-half" iconType="solid" href="/sdk/latest/sync/device-storage-management">
    Time-to-Live (TTL) eviction strategies to manage storage in long-running deployments.
  </Card>

  <Card title="Schema Versioning" icon="code-branch" iconType="solid" href="/best-practices/schema-versioning">
    Ditto is schemaless, so additive changes flow through automatically. This guide covers the two patterns for handling breaking changes — a `schema_version` discriminator and separate collections — and when each applies.
  </Card>

</CardGroup>

## Modeling Relationships

The default in Ditto is to **denormalize related data into a single document**, with sub-entities stored as maps keyed by ID. A single write is atomic, the full document syncs as a unit, and Ditto's add-wins maps merge concurrent edits to different sub-entities cleanly. See [Denormalized Documents](/best-practices/conflict-resolution-patterns#denormalized-documents-one-document-atomic-sync) for the full pattern.

### Embedding sub-entities as maps

Consider a `people` collection where each person owns a set of cars. The recommended pattern is to store the cars inline as a map keyed by car ID:

```json JSON theme={null}
{
    "_id": "abc123",
    "name": "Susan",
    "age": 31,
    "cars": {
        "def456": {
            "make": "Hyundai",
            "color": "red",
            "mileage": 13000
        },
        "ghi789": {
            "make": "Jeep",
            "color": "blue",
            "mileage": 34000
        }
    }
}
```

Two devices can update different cars — or different fields on the same car — and Ditto merges both writes automatically. There is no special performance penalty for this nesting; CRDT maps work the same way at every level of the document.
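The merge behavior described above can be illustrated with a toy model. The sketch below is plain Python, not Ditto's CRDT implementation: `merge_maps` is a hypothetical helper that applies a partial update key by key, and it ignores timestamps, removals, and conflicts on the same field. It only shows why edits that touch different keys of a nested map do not collide.

```python
from copy import deepcopy


def merge_maps(base: dict, update: dict) -> dict:
    """Recursively apply `update` on top of `base`, key by key.

    Toy model only (hypothetical helper, not a Ditto API): it ignores
    timestamps, removals, and concurrent writes to the same field.
    """
    merged = deepcopy(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides hold a map here, so descend instead of overwriting.
            merged[key] = merge_maps(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


person = {
    "_id": "abc123",
    "name": "Susan",
    "age": 31,
    "cars": {
        "def456": {"make": "Hyundai", "color": "red", "mileage": 13000},
        "ghi789": {"make": "Jeep", "color": "blue", "mileage": 34000},
    },
}

# Device A updates one car's mileage; device B repaints the other car.
edit_a = {"cars": {"def456": {"mileage": 13250}}}
edit_b = {"cars": {"ghi789": {"color": "green"}}}

# The edits touch different keys, so the order of application does not matter.
a_then_b = merge_maps(merge_maps(person, edit_a), edit_b)
b_then_a = merge_maps(merge_maps(person, edit_b), edit_a)
assert a_then_b == b_then_a
```

Had the cars been stored as an array instead, both edits would have targeted positions in one list rather than independent keys, which is why the map-keyed-by-ID shape is the one recommended here.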

### When to split into a separate collection

Pull a sub-entity into its own collection when one of the following clearly applies:

* **Independent permission scopes.** The sub-entity is owned or edited by a principal who shouldn't have read or write access to the rest of the parent. For example, a mechanic may need to edit cars without seeing the owner's other personal data.
* **Independent access at scale.** The application frequently reads or writes the sub-entity without ever touching the parent. For example, a fleet view scans cars across many owners.
* **Document size pressure.** Continuing to embed the sub-entity would push the parent past the [document size limits](/key-concepts/document-model#document-size).

In a split model, each sub-entity lives in its own document with a foreign-key reference back to the parent:

```json JSON theme={null}
{
    "_id": "def456",
    "ownerId": "abc123",
    "make": "Hyundai",
    "color": "red",
    "mileage": 13000
}
```

Splitting trades atomic single-document sync and simpler reads for the access flexibility of a separate collection. Use the embedded form unless one of the criteria above clearly applies.
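As a rough illustration of the trade-off, the sketch below converts the embedded shape into the split shape and then reassembles one owner's cars at read time. It is plain Python operating on dictionaries; `split_cars` and `cars_for_owner` are hypothetical helper names, and no Ditto SDK calls are involved. In a real application the lookup would be a query against the separate collection.

```python
def split_cars(person: dict) -> tuple[dict, list[dict]]:
    """Turn one embedded `people` document into a parent plus car documents.

    Hypothetical helper: each car's map key becomes its `_id`, and the
    parent's `_id` is copied into `ownerId` as the foreign-key reference.
    """
    parent = {key: value for key, value in person.items() if key != "cars"}
    cars = [
        {"_id": car_id, "ownerId": person["_id"], **fields}
        for car_id, fields in person.get("cars", {}).items()
    ]
    return parent, cars


def cars_for_owner(cars: list[dict], owner_id: str) -> list[dict]:
    """Read-side join: the extra lookup that the embedded form avoids."""
    return [car for car in cars if car["ownerId"] == owner_id]


person = {
    "_id": "abc123",
    "name": "Susan",
    "age": 31,
    "cars": {
        "def456": {"make": "Hyundai", "color": "red", "mileage": 13000},
        "ghi789": {"make": "Jeep", "color": "blue", "mileage": 34000},
    },
}

parent, cars = split_cars(person)
susans_cars = cars_for_owner(cars, "abc123")
```

After the split, the parent and each car are independent documents, so a write that touches both the owner and a car is no longer a single atomic update.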

## Document Size

Document size doesn't drive steady-state sync, because Ditto syncs only the fields that change between peers. It does affect local storage, memory, serialization time, CRDT merge cost, and the initial replication of new documents. Ditto warns at a soft threshold of 256 KiB and logs an error at a hard limit of 5 MiB; future versions will reject documents that exceed the hard limit and stop replicating them. See [Document Size Limits](/best-practices/document-size-limits) for monitoring, remediation, and configuration, or [Document Size](/key-concepts/document-model#document-size) for the conceptual overview.

If denormalizing a sub-entity would push the parent toward these limits, that is one of the [criteria for splitting it into a separate collection](#when-to-split-into-a-separate-collection). For large binary blobs such as images or video, use the [`ATTACHMENT`](/sdk/latest/crud/working-with-attachments) data type rather than embedding raw bytes.
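For a rough early warning during development, a document's size can be estimated before it is written. The sketch below is plain Python and makes two assumptions: it measures the compact JSON encoding, which is only a proxy for how Ditto measures its internal CRDT representation, and it treats both thresholds as exclusive (a document exactly at the limit passes). `approximate_size` and `classify_size` are hypothetical helper names, not Ditto APIs.

```python
import json

SOFT_LIMIT_BYTES = 256 * 1024       # 256 KiB soft threshold from the text above
HARD_LIMIT_BYTES = 5 * 1024 * 1024  # 5 MiB hard limit from the text above


def approximate_size(document: dict) -> int:
    """Return the byte length of the document's compact JSON encoding.

    Assumption: JSON size is only a stand-in for the size Ditto reports,
    which reflects its own internal representation.
    """
    encoded = json.dumps(document, separators=(",", ":"), ensure_ascii=False)
    return len(encoded.encode("utf-8"))


def classify_size(size_bytes: int) -> str:
    """Map a byte count onto the soft and hard thresholds."""
    if size_bytes > HARD_LIMIT_BYTES:
        return "error"
    if size_bytes > SOFT_LIMIT_BYTES:
        return "warning"
    return "ok"


person = {"_id": "abc123", "name": "Susan", "age": 31}
status = classify_size(approximate_size(person))
```

A check like this is best treated as a development-time guard; the monitoring described in the Document Size Limits guide remains the authoritative signal.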
