Schema Versioning
In a distributed environment, data structures, schemas, and documents organically undergo changes over time, which can lead to data inconsistencies.
To maintain data consistency and integrity throughout these changes, Ditto offers built-in schema versioning patterns and tools that you can incorporate into your workflow. They help you change data structures more safely, adapt to new requirements, and keep data replication reliable and consistent across distributed peers.
This topic provides strategies you can implement in your app to handle changes in schema and versioning.
Changing your schema is inevitable. To ensure reliability over time, create your own schema versioning pattern for each Ditto document.
A schema versioning pattern is a systematic approach to managing schema changes over time. It gives you mechanisms for tracking and controlling your data structures as they evolve.
The following practices can be applied in your schema versioning pattern:
Practice | Description |
---|---|
Standard naming conventions | Establish consistent naming rules for elements in your schema. |
Field modifications | Handle changes to existing fields effectively. |
Field types | Manage the types of fields within your schema. |
Validation and transformation | Ensure data validation and transformation procedures are in place. |
Upgrade notifications | Implement a system to notify users about schema upgrades. |
Ditto's replication protocol is designed to be backward-compatible. Because a device running an old version of your app can still sync, you will eventually run into the "couch device problem" (i.e., a device that fell behind a couch). In other words, a device in your mesh may be offline for a significant amount of time before it reconnects with other devices. If the shape of the documents on that device differs significantly from the current one, some documents may not conform to what your new application code expects. Unless your application schema takes precautions, synchronizing with this "couch device" could cause other devices to crash unexpectedly in production.
Some applications do not need backward- or forward-compatibility, which can simplify their business logic significantly. If that sounds like your application, we recommend a pattern where you change the name of the collection for each schema version of your application. This also guarantees that field types never change within a collection.
For example, you can use myCollection_v{number} as a convention to specify the collection schema version your app will be listening to. When a schema change is necessary, bump the number.
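The naming convention above can be sketched as a pair of small helpers. This is a minimal illustration only: the function names and the `SCHEMA_VERSION` constant are hypothetical, and the resulting string would be passed to whatever collection API your SDK exposes.

```python
import re

# Hypothetical: the schema version this build of the app understands.
SCHEMA_VERSION = 2

# Matches names of the form <base>_v<number>, e.g. myCollection_v2.
_VERSIONED = re.compile(r"^(?P<base>.+)_v(?P<version>\d+)$")


def versioned_collection(base_name: str, version: int = SCHEMA_VERSION) -> str:
    """Build the collection name for a schema version, e.g. myCollection_v2."""
    if version < 1:
        raise ValueError("schema version must be a positive integer")
    return f"{base_name}_v{version}"


def parse_collection(name: str) -> tuple[str, int]:
    """Split a versioned collection name back into (base_name, version)."""
    match = _VERSIONED.match(name)
    if match is None:
        raise ValueError(f"not a versioned collection name: {name!r}")
    return match.group("base"), int(match.group("version"))
```

Bumping `SCHEMA_VERSION` in a new release is then the only change needed for the app to read from and write to a fresh collection.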
Collections are very cheap to create in Ditto, so this will scale even for applications that run for many years.
Alternatively, you could synchronize only documents whose schema version matches your current schema version.
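A rough sketch of that filter, written against plain dictionaries rather than any SDK type. It assumes each document carries a `_schemaVersion` field; `CURRENT_SCHEMA_VERSION` and `same_version_only` are hypothetical names. In a real app the same condition would more likely live in the sync subscription query, so that mismatched documents are never replicated in the first place.

```python
# Hypothetical: the schema version this build of the app understands.
CURRENT_SCHEMA_VERSION = 2


def same_version_only(documents, current_version=CURRENT_SCHEMA_VERSION):
    """Keep only documents whose _schemaVersion equals the current version.

    Documents that lack the field are treated as non-matching.
    """
    return [
        doc for doc in documents
        if doc.get("_schemaVersion") == current_version
    ]
```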
In a typical centralized database like PostgreSQL, developers often focus on backward‑compatibility, where newer versions of the application can open old documents. In a distributed system, you do not have central control of all modifications to data. In an offline peer-to-peer mesh, it is difficult and sometimes impossible to control all versions of your application that are active in production environments. Because of these constraints, you need to think about not only backward‑compatibility but also forward‑compatibility.
An application is forward-compatible when existing code is able to read new data. A familiar example comes from web development: browsers ignore HTML elements and attributes they do not recognize, so older browsers can still render newer pages.
To achieve forward‑compatibility of your database, you should never change the type of an existing field. In other words, developers should only ever add new fields, and never remove or modify old fields. You can ensure this by creating a controller that encapsulates Ditto and is used across your application(s) to validate the field values and their associated types before upserting those values into the database.
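One possible shape for such a controller is sketched below. Everything here is illustrative: `TaskController`, `TASK_SCHEMA`, and the field names are hypothetical, and the `upsert` callable stands in for whatever write call your app actually uses, so no real database API is shown.

```python
from typing import Any, Callable

# Hypothetical schema: field name -> expected type. New fields may be added
# over time, but the type of an existing entry must never change.
TASK_SCHEMA: dict[str, type] = {
    "_id": str,
    "title": str,
    "isCompleted": bool,
    "priority": int,
}


class SchemaError(TypeError):
    """Raised when a document does not match the declared schema."""


class TaskController:
    """Single entry point for writes; validates before delegating to the store."""

    def __init__(self, upsert: Callable[[dict[str, Any]], None],
                 schema: dict[str, type] = TASK_SCHEMA):
        self._upsert = upsert  # stand-in for the real database write
        self._schema = schema

    def validate(self, document: dict[str, Any]) -> None:
        for field, value in document.items():
            expected = self._schema.get(field)
            if expected is None:
                raise SchemaError(f"unknown field {field!r}")
            # Exact type match, so a bool is not accepted for an int field.
            if type(value) is not expected:
                raise SchemaError(
                    f"field {field!r} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )

    def upsert(self, document: dict[str, Any]) -> None:
        self.validate(document)
        self._upsert(document)
```

Routing every write through one class like this keeps the type rules in a single place, which makes it harder for a later release to change a field's type by accident.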
Older data may or may not be important to your application. It is up to you to decide what to do with old documents: accept them as-is, reject (ignore) them, or migrate them to the new schema.
For example, here's a breaking version change where we add a new field and change the type of an old field:
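As a hypothetical illustration of such a change (the document shapes and field names below are assumptions, not taken from a real app), version 1 stores `price` as a string, while version 2 stores it as a number and adds a `currency` field. The sketch also shows the "migrate" option in its simplest form.

```python
# Version 1 document: price is a string.
V1_DOC = {"_id": "order-1", "_schemaVersion": 1, "price": "12.50"}

# Version 2 document: price is now a number (type change) and
# currency is a new field. Code written for version 1 cannot read this safely.
V2_DOC = {"_id": "order-2", "_schemaVersion": 2, "price": 12.5, "currency": "USD"}


def migrate_v1_to_v2(doc, default_currency="USD"):
    """Return a version 2 copy of a version 1 document.

    The default currency is an assumption made for this example.
    """
    if doc.get("_schemaVersion") != 1:
        raise ValueError("expected a version 1 document")
    migrated = dict(doc)                     # leave the original untouched
    migrated["price"] = float(doc["price"])  # string -> number
    migrated["currency"] = default_currency  # new field
    migrated["_schemaVersion"] = 2
    return migrated
```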
You may also want to ignore documents that come from incompatible versions of your application.
Supporting the Latest Version
When a new application version is detected, you can stop synchronizing. You can detect that a new application version is available by querying for a _schemaVersion that is greater than the current version. If a new version is detected, stop sync and tell the user they need to upgrade their app to the latest version.
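The detection step might look roughly like the following. This is a sketch under stated assumptions: documents are plain dictionaries with an integer `_schemaVersion`, and `stop_sync` and `notify_user` are hypothetical callbacks supplied by your app rather than real SDK functions.

```python
# Hypothetical: the schema version this build of the app understands.
CURRENT_SCHEMA_VERSION = 2


def newer_version_detected(documents, current_version=CURRENT_SCHEMA_VERSION):
    """True if any document reports a _schemaVersion above the current one."""
    return any(
        isinstance(doc.get("_schemaVersion"), int)
        and doc["_schemaVersion"] > current_version
        for doc in documents
    )


def check_for_upgrade(documents, stop_sync, notify_user,
                      current_version=CURRENT_SCHEMA_VERSION):
    """Stop sync and prompt the user when a newer schema version is seen.

    Returns True if an upgrade is required, False otherwise.
    """
    if newer_version_detected(documents, current_version):
        stop_sync()
        notify_user("A newer version of this app is available. Please upgrade.")
        return True
    return False
```

Running a check like this whenever new documents arrive lets an outdated device stop syncing before it tries to read data it cannot interpret.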
This is a common pattern that many applications use. For example, Apple Notes warns users that they are on an older version and will experience degraded features until they upgrade.