Documentation Index

Fetch the complete documentation index at: https://docs.ditto.live/llms.txt

Use this file to discover all available pages before exploring further.

Ditto is schemaless: documents in a collection can have arbitrary fields, and the sync layer stores and replicates them without enforcing a schema. Because of this, additive changes (adding new fields to existing documents) typically need no versioning pattern. Old application code keeps reading the fields it already understands; how it handles unfamiliar extra fields is up to your deserializer, not Ditto itself.

This guide is for breaking, non-additive changes: changing a field's type, removing or renaming a field, or redefining the meaning of an existing field. These changes are not safe by default in a distributed mesh, because a peer running older application code may try to read a document whose schema it doesn't recognize, potentially causing crashes or incorrect behavior. Two equivalent patterns handle this safely; pick whichever fits your change.

Prefer additive changes

Before reaching for a versioning pattern, ask whether you can just add a new field instead. Instead of changing mileage from miles to kilometers, add mileage_km alongside mileage and let application code read whichever field it understands.
Never change the type of, remove, or rename an existing field without a versioning pattern. Old peers still reading or writing the old schema may crash or return incorrect results; type changes on indexed fields in particular can produce wrong query results, because Ditto tracks only the most recently written data-type variant for a field. Prefer adding a new field; if the change is genuinely breaking, use one of the patterns below.
The two patterns below handle breaking changes like field renames, reshuffling, and type changes on indexed fields.
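As a sketch, the additive mileage change above could be written as a plain update. The document ID and the kilometre value here are illustrative, not from the examples below:
DQL
UPDATE cars SET mileage_km = 51 WHERE _id = 'car-001'
Old application code simply never reads mileage_km; new code prefers it and can fall back to mileage.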

Pattern 1: schema_version in a composite _id

Add a schema_version field to each document and filter sync subscriptions on it. Each peer subscribes only to the versions its application code understands, so it receives and stores only the documents matching its subscription. Store schema_version as a subfield of the document’s composite _id. Because _id is immutable, a document’s schema version can’t be accidentally changed by an UPDATE, and a new-version document is a new record — not a modified version of the old one. For example, suppose v1 documents stored fuel economy under an ambiguously named mpg field:
DQL
INSERT INTO cars DOCUMENTS ({
  _id: { id: 'car-001', schema_version: 1 },
  make: 'Hyundai',
  color: 'red',
  mpg: 32
})
When the application is updated to rename that field for clarity, v2 documents are written with the new schema and a bumped schema version:
DQL
INSERT INTO cars DOCUMENTS ({
  _id: { id: 'car-001', schema_version: 2 },
  make: 'Hyundai',
  color: 'red',
  fuel_economy_mpg: 32
})
These two documents coexist in the cars collection: because their composite _id values differ, the v2 document is a new record, not a modified version of the v1 one. Each application version registers a sync subscription scoped to the schema versions it understands:
DQL
SELECT * FROM cars WHERE _id.schema_version = 2
Because Ditto applies the subscription filter at sync time, a peer only receives documents matching its subscription; read queries and observers don't need to re-apply the filter.

Your application code still needs a model type per schema version it supports. A peer subscribed to a single version deserializes results into that version's model, e.g. CarV2 with fuel_economy_mpg. A peer subscribed to both versions (during a rollout) sees both schemas and must convert each to the right model based on _id.schema_version. Ditto doesn't map between versions for you.
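That per-version conversion can be sketched in plain Python. This is illustrative only: CarV1, CarV2, and the dict-shaped documents are assumptions for the sketch, not Ditto SDK types.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CarV1:
    id: str
    make: str
    color: str
    mpg: int


@dataclass
class CarV2:
    id: str
    make: str
    color: str
    fuel_economy_mpg: int


def to_model(doc: dict) -> CarV1 | CarV2:
    """Pick the model type from _id.schema_version; Ditto won't map versions for you."""
    version = doc["_id"]["schema_version"]
    if version == 1:
        return CarV1(doc["_id"]["id"], doc["make"], doc["color"], doc["mpg"])
    if version == 2:
        return CarV2(doc["_id"]["id"], doc["make"], doc["color"], doc["fuel_economy_mpg"])
    raise ValueError(f"unsupported schema_version: {version}")


def fuel_economy(car: CarV1 | CarV2) -> int:
    # One read path that handles both schemas during a rollout.
    return car.mpg if isinstance(car, CarV1) else car.fuel_economy_mpg
```

A peer subscribed only to v2 would call to_model and expect CarV2 results; a bridge-phase peer subscribed to both versions routes every document through the same dispatch.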

Pattern 2: Separate collections per breaking version

Alternatively, put the new schema in its own collection, for example cars → cars_v2. Because Ditto sync subscriptions are per-collection, a peer that hasn’t subscribed to cars_v2 simply doesn’t receive documents from it. Pattern 2 doesn’t need schema_version in the _id — the collection name already distinguishes versions, so use whatever _id structure suits the collection otherwise.
DQL
INSERT INTO cars_v2 DOCUMENTS ({
  _id: 'car-001',
  make: 'Hyundai',
  color: 'red',
  fuel_economy: { value: 32, unit: 'mpg' }
})
DQL
SELECT * FROM cars_v2
Existing documents in the original cars collection continue to sync for peers still on the old application version; they phase out as described below.

Comparing the two patterns

Both patterns are valid choices for any breaking change. They differ in structure, not in preference:
  • Collection name: stays the same under Pattern 1; Pattern 2 uses a new collection per version.
  • _id structure: composite with a schema_version subfield under Pattern 1; whatever fits the collection otherwise under Pattern 2.
  • Subscription filter: WHERE _id.schema_version = N under Pattern 1; FROM cars_v2 under Pattern 2.
  • Cross-collection references: unchanged under Pattern 1; under Pattern 2, foreign-key references into the versioned collection need updating.
  • Indexing during the bridge phase: under Pattern 1, a type change on an indexed field can produce wrong query results while both schemas coexist in a peer's local store (see Indexing); under Pattern 2, each version's index is isolated in its own collection, with no mixing.
Pick whichever fits the change you're making. There's one case where the choice matters: if you're changing the type of an indexed field, separate collections avoid a rollout hazard that the schema_version approach can't avoid.

Rolling out the new version

Because each peer subscribes only to the versions its application code understands, you cannot start writing new-version documents until every deployed app version can read them. A v2 write made while some users are still on a v1-only subscription stays invisible to those users until they upgrade — they keep reading and editing the v1 copy of the data, while the v2 copy accumulates separately in the mesh. Roll out a new schema version in these phases:
  1. Ship a bridge version of the app that reads both versions. Register sync subscriptions for both v1 and v2 — two subscriptions in Pattern 1 (one per _id.schema_version), or one per collection in Pattern 2. Continue to write the old schema. Read paths handle either schema.
  2. Wait until the bridge version has reached every deployed device. Until it does, any v2 write will be invisible to users still on the older app version.
  3. Ship the writer version. Once every peer can read v2, start writing v2 documents. Reads still handle both schemas during the transition.
  4. Drop v1 support. Once old-schema documents have drained from the mesh (see the next section), remove the v1 subscription and the v1 model type.
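Under Pattern 1, for example, the bridge version of step 1 registers one subscription per schema version it can read (the version numbers here are illustrative):
DQL
SELECT * FROM cars WHERE _id.schema_version = 1

SELECT * FROM cars WHERE _id.schema_version = 2
Under Pattern 2, the equivalent is one subscription per collection: SELECT * FROM cars and SELECT * FROM cars_v2.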

Phasing out the old schema

Step 4 of the rollout above — “drop v1 support” — depends on old-schema documents no longer showing up in the mesh. They don’t delete themselves; you have to actively remove them. Ditto gives you two DQL operations for this:
  • EVICT — removes documents from the local peer only. Use it to reclaim space on a device once its user has upgraded past v1; other peers are unaffected.
  • DELETE — removes documents globally. DELETE creates a tombstone that propagates via sync, so when a late-arriving offline device reconnects, it applies the deletion too.
A common pattern is TTL-style cleanup: timestamp your documents on write, then periodically run an EVICT or DELETE query matching documents older than some age. Once old-schema documents have been cleaned up, remove the v1 subscription and the v1 model type from the app.

Don't try to force the transition with a backfill. A backfill (reading every v1 document, writing a v2 equivalent, and deleting the v1) can't catch documents on offline devices. When those devices reconnect after the backfill "completes," their local v1 documents sync into the mesh and re-introduce the old schema. You still need the cleanup queries above to handle late-arriving docs, so the backfill is extra work for no benefit.
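A TTL-style cleanup under Pattern 1 might look like the following sketch. The created_at field and the cutoff value are assumptions for illustration; the patterns above don't require them:
DQL
EVICT FROM cars WHERE _id.schema_version = 1 AND created_at < '2024-06-01T00:00:00Z'
Swap EVICT for DELETE when you want the removal to propagate to every peer rather than only reclaim local space.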

See also

  • Types and Definitions — DQL collection definitions and CRDT type annotations
  • Indexing — index behavior, subfield indexing, and the data-type-variant caveat
  • Strict Mode — how CRDT type defaulting interacts with heterogeneous documents
  • Data Modeling Tips — flat vs. embedded models and composite _id patterns