# Schema Versioning

> Ditto is a schemaless document store, so additive changes flow through automatically. Use these patterns only when you need to make a breaking change to a document's schema.

Ditto is schemaless: documents in a collection can have arbitrary fields, and the sync layer stores and replicates them without enforcing a schema. Because of this, **additive changes — adding new fields to existing documents — typically need no versioning pattern**. Old application code keeps reading the fields it already understands; how it handles unfamiliar extra fields is up to your deserializer, not Ditto itself.

This guide is for **breaking, non-additive changes**: changing a field's type, removing or renaming a field, or redefining the meaning of an existing field. These changes are not safe by default in a distributed mesh, because a peer running older application code may try to read a document whose schema it does not recognize, which can cause crashes or incorrect behavior. Two patterns handle this safely; pick whichever fits your change.

## Prefer additive changes

Before reaching for a versioning pattern, ask whether you can just add a new field instead. Instead of changing `mileage` from miles to kilometers, add `mileage_km` alongside `mileage` and let application code read whichever field it understands.
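
As a minimal sketch of that additive approach, newer application code can prefer `mileage_km` and fall back to `mileage`, while older code keeps reading `mileage` untouched. The function name `mileage_in_km` is hypothetical, and a plain dict stands in for whatever document representation your SDK returns.

```python
from typing import Any, Mapping, Optional

MILES_TO_KM = 1.609344  # one international mile in kilometers


def mileage_in_km(doc: Mapping[str, Any]) -> Optional[float]:
    """Return the mileage in kilometers, or None if neither field is set."""
    # Newer writers populate `mileage_km`; prefer it when present.
    if doc.get("mileage_km") is not None:
        return float(doc["mileage_km"])
    # Older documents carry only `mileage`, which is in miles.
    if doc.get("mileage") is not None:
        return float(doc["mileage"]) * MILES_TO_KM
    return None
```

Because the old field is never removed or retyped, no peer needs a subscription change for this kind of update.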

<Warning>
  **Never change the type of, remove, or rename an existing field without a versioning pattern.** Old peers still reading or writing the old schema may crash or return incorrect results. Type changes on indexed fields in particular can produce wrong query results, because Ditto tracks only the most recently written data-type variant for a field. Prefer adding a new field; if the change is genuinely breaking, use one of the patterns below.
</Warning>

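The two patterns below handle breaking changes such as field renames, restructured fields, and type changes. For a type change on an indexed field, the choice of pattern matters; see [Comparing the two patterns](#comparing-the-two-patterns).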

## Pattern 1: `schema_version` in a composite `_id`

Add a `schema_version` field to each document and filter [sync subscriptions](/sdk/latest/sync/syncing-data) on it. Each peer subscribes only to the versions its application code understands, so it receives and stores only the documents matching its subscription.

Store `schema_version` as a subfield of the document's [composite `_id`](/best-practices/data-modeling). Because `_id` is immutable, a document's schema version can't be accidentally changed by an UPDATE, and a new-version document is a new record — not a modified version of the old one.

For example, suppose v1 documents stored fuel economy under an ambiguously named `mpg` field:

```sql DQL
INSERT INTO cars DOCUMENTS ({
  _id: { id: 'car-001', schema_version: 1 },
  make: 'Hyundai',
  color: 'red',
  mpg: 32
})
```

When the application is updated to rename that field for clarity, v2 documents are written with the new schema and a bumped schema version:

```sql DQL
INSERT INTO cars DOCUMENTS ({
  _id: { id: 'car-001', schema_version: 2 },
  make: 'Hyundai',
  color: 'red',
  fuel_economy_mpg: 32
})
```

These two documents coexist in the `cars` collection: because their composite `_id` values differ, the v2 document is a new record, not a modified version of the v1 one.

Each application version registers a sync subscription scoped to the schema versions it understands:

```sql DQL
SELECT * FROM cars WHERE _id.schema_version = 2
```

Because Ditto applies the subscription filter at sync time, a peer only receives documents matching its subscriptions. On a peer subscribed to a single version, read queries and observers don't need to re-apply the filter.

Your application code still needs a model type per schema version it supports. A peer subscribed to a single version deserializes results into that version's model — e.g. `CarV2` with `fuel_economy_mpg`. A peer subscribed to both versions (during a [rollout](#rolling-out-the-new-version)) sees both schemas and must convert each to the right model based on `_id.schema_version`. Ditto doesn't map between versions for you.
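
One way to structure that conversion, sketched under assumptions: `CarV1`, `decode_car`, and `upgrade` are hypothetical names (only `CarV2` appears above), and plain dicts stand in for query results. The dispatch key is the `_id.schema_version` subfield from the examples in this section.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Union


@dataclass(frozen=True)
class CarV1:
    id: str
    make: str
    color: str
    mpg: float


@dataclass(frozen=True)
class CarV2:
    id: str
    make: str
    color: str
    fuel_economy_mpg: float


def decode_car(doc: Mapping[str, Any]) -> Union[CarV1, CarV2]:
    """Pick the model type from the `_id.schema_version` subfield."""
    doc_id = doc["_id"]
    version = doc_id["schema_version"]
    if version == 1:
        return CarV1(doc_id["id"], doc["make"], doc["color"], doc["mpg"])
    if version == 2:
        return CarV2(doc_id["id"], doc["make"], doc["color"], doc["fuel_economy_mpg"])
    # An unknown version means the subscription is wider than the code supports.
    raise ValueError(f"unsupported schema_version: {version!r}")


def upgrade(car: Union[CarV1, CarV2]) -> CarV2:
    """Convert in memory so the rest of the app handles only CarV2."""
    if isinstance(car, CarV1):
        return CarV2(car.id, car.make, car.color, fuel_economy_mpg=car.mpg)
    return car
```

Raising on an unknown version, rather than guessing, surfaces a subscription that is broader than the model types the app ships with.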

## Pattern 2: Separate collections per breaking version

Alternatively, put the new schema in its own collection, for example `cars` → `cars_v2`. Because Ditto sync subscriptions are per-collection, a peer that hasn't subscribed to `cars_v2` simply doesn't receive documents from it.

Pattern 2 doesn't need `schema_version` in the `_id` — the collection name already distinguishes versions, so use whatever `_id` structure suits the collection otherwise.

```sql DQL
INSERT INTO cars_v2 DOCUMENTS ({
  _id: 'car-001',
  make: 'Hyundai',
  color: 'red',
  fuel_economy: { value: 32, unit: 'mpg' }
})
```

```sql DQL
SELECT * FROM cars_v2
```

Existing documents in the original `cars` collection continue to sync for peers still on the old application version; they phase out as described below.

## Comparing the two patterns

Both patterns are valid choices for any breaking change. Neither is preferred in general; they differ in structure:

|                                  | **Pattern 1: `schema_version` discriminator**                                                                                                        | **Pattern 2: Separate collections**                                |
| -------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------ |
| Collection name                  | Stays the same                                                                                                                                       | New collection per version                                         |
| `_id` structure                  | Composite, with `schema_version` subfield                                                                                                            | Whatever fits the collection otherwise                             |
| Subscription filter              | `WHERE _id.schema_version = N`                                                                                                                       | `FROM cars_v2`                                                     |
| Cross-collection references      | Unchanged                                                                                                                                            | Foreign-key references into the versioned collection need updating |
| Indexing during the bridge phase | A type change on an indexed field can produce wrong query results while both schemas coexist in a peer's local store (see [Indexing](/dql/indexing)) | Each version's index is isolated in its own collection — no mixing |

Pick whichever fits the change you're making. There's one case where the choice matters: if you're changing the type of an indexed field, separate collections avoid a rollout hazard that the `schema_version` approach can't.

## Rolling out the new version

Because each peer subscribes only to the versions its application code understands, **you cannot start writing new-version documents until every deployed app version can read them**. A v2 write made while some users are still on a v1-only subscription stays invisible to those users until they upgrade — they keep reading and editing the v1 copy of the data, while the v2 copy accumulates separately in the mesh.

Roll out a new schema version in these phases:

1. **Ship a bridge version of the app that reads both versions.** Register sync subscriptions for both v1 and v2 — two subscriptions in Pattern 1 (one per `_id.schema_version`), or one per collection in Pattern 2. Continue to write the old schema. Read paths handle either schema.
2. **Wait until the bridge version has reached every deployed device.** Until it does, any v2 write will be invisible to users still on the older app version.
3. **Ship the writer version.** Once every peer can read v2, start writing v2 documents. Reads still handle both schemas during the transition.
4. **Drop v1 support.** Once old-schema documents have drained from the mesh (see the next section), remove the v1 subscription and the v1 model type.
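
The phases above can be sketched as a small table of what each release subscribes to and writes, using Pattern 1 queries. The release names (`legacy`, `bridge`, `writer`, `v2_only`) and the `can_ship_writer` gate are hypothetical; how you learn which releases are still deployed is outside this sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

V1_QUERY = "SELECT * FROM cars WHERE _id.schema_version = 1"
V2_QUERY = "SELECT * FROM cars WHERE _id.schema_version = 2"


@dataclass(frozen=True)
class ReleasePlan:
    subscriptions: Tuple[str, ...]  # DQL queries registered for sync
    write_version: int              # schema version used for new documents


# Hypothetical release names; "bridge", "writer", "v2_only" map to steps 1, 3, 4.
RELEASES = {
    "legacy": ReleasePlan((V1_QUERY,), write_version=1),
    "bridge": ReleasePlan((V1_QUERY, V2_QUERY), write_version=1),
    "writer": ReleasePlan((V1_QUERY, V2_QUERY), write_version=2),
    "v2_only": ReleasePlan((V2_QUERY,), write_version=2),
}


def can_ship_writer(deployed_releases: Iterable[str]) -> bool:
    """Step 2 gate: every deployed release must already subscribe to v2."""
    return all(V2_QUERY in RELEASES[name].subscriptions for name in deployed_releases)
```

For example, `can_ship_writer(["legacy", "bridge"])` is false because `legacy` devices would never receive v2 writes.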

## Phasing out the old schema

Step 4 of the rollout above — "drop v1 support" — depends on old-schema documents no longer showing up in the mesh. They don't delete themselves; you have to actively remove them.

Ditto gives you two DQL operations for this:

* [**`EVICT`**](/dql/evict) — removes documents from the local peer only. Use it to reclaim space on a device once its user has upgraded past v1; other peers are unaffected.
* [**`DELETE`**](/dql/delete) — removes documents globally. `DELETE` creates a tombstone that propagates via sync, so when a late-arriving offline device reconnects, it applies the deletion too.

A common pattern is TTL-style cleanup: timestamp your documents on write, then periodically run an `EVICT` or `DELETE` query matching documents older than some age.
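
A minimal sketch of building such a cleanup statement for v1 documents under Pattern 1 follows. It assumes each document carries a hypothetical `updated_at` field written as a UTC ISO-8601 string in one consistent format (so string comparison matches time order), and that the statement is executed with named arguments through your SDK; `stale_v1_cleanup` is an illustrative helper, not a Ditto API.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Tuple


def stale_v1_cleanup(
    now: datetime, max_age: timedelta, globally: bool = False
) -> Tuple[str, Dict[str, str]]:
    """Build a cleanup statement and its named arguments for old v1 documents.

    `now` must be timezone-aware; `updated_at` is an assumed field name.
    """
    # EVICT affects only the local peer; DELETE propagates a tombstone via sync.
    verb = "DELETE" if globally else "EVICT"
    cutoff = (now - max_age).astimezone(timezone.utc).isoformat(timespec="seconds")
    query = (
        f"{verb} FROM cars "
        "WHERE _id.schema_version = 1 AND updated_at < :cutoff"
    )
    return query, {"cutoff": cutoff}
```

Passing the cutoff as an argument, rather than formatting it into the query text, keeps the statement itself constant between runs.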

Once old-schema documents have been cleaned up, remove the v1 subscription and the v1 model type from the app.

**Don't try to force the transition with a backfill.** A backfill — reading every v1 document, writing a v2 equivalent, and deleting the v1 — can't catch documents on offline devices. When those devices reconnect after the backfill "completes," their local v1 documents sync into the mesh and re-introduce the old schema. You still need the cleanup queries above to handle late-arriving docs, so the backfill is extra work for no benefit.

## See also

* [Types and Definitions](/dql/types-and-definitions) — DQL collection definitions and CRDT type annotations
* [Indexing](/dql/indexing) — index behavior, subfield indexing, and the data-type-variant caveat
* [Strict Mode](/dql/strict-mode) — how CRDT type defaulting interacts with heterogeneous documents
* [Data Modeling Tips](/best-practices/data-modeling) — flat vs. embedded models and composite `_id` patterns
