Schema Versioning
In a distributed environment, data structures, schemas, and documents change over time, which can lead to data inconsistencies between peers.
To maintain data consistency and integrity across distributed peers, Ditto recommends the following schema versioning patterns.
Design your Schema Versioning Pattern
A schema versioning pattern is a systematic approach to managing and handling changes to your schema over time by providing mechanisms for tracking and controlling your data structures as they inevitably evolve.
To ensure data consistency and reliability over time, create your own schema versioning pattern for each Ditto document.
The following practices can be applied in your schema versioning pattern:
Practice | Description |
---|---|
Standard naming conventions | Establish consistent naming rules. |
Forward-compatibility | Do not change types of existing fields. Introduce new fields instead. |
Validation and transformation | Ensure data validation and transformation procedures are in place. |
Upgrade notifications | Implement a system to notify users about schema upgrades. |
Backward-Compatibility
Older data may or may not still matter to your application. You decide what to do with old documents: accept them as-is, reject (ignore) them, or migrate them to the new schema.
Ditto’s replication protocol is designed to be backward-compatible. Because older peers can still sync, you will eventually run into the “couch device problem”: a device in your mesh (for example, one that fell behind a couch) may be offline for a significant amount of time before it reconnects with other devices.
If the shape of your documents is significantly different on that device, there could be documents that do not conform with your new application code. Synchronizing with this “couch device” could cause other devices to crash unexpectedly in production if precautions aren’t taken in your application schema.
If that sounds like your application, we recommend that you use a pattern where you add a schema version to your documents. When a schema change is necessary, bump the number.
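The accept, reject, and migrate options above can be sketched as follows. This is a minimal illustration that treats documents as plain Python dictionaries; it does not use the Ditto SDK. The names `CURRENT_SCHEMA_VERSION`, `MIGRATIONS`, `migrate_v1_to_v2`, and `normalize`, the `tags` field, and the rule that unversioned documents count as version 1 are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional

# Hypothetical: the schema version this build of the app understands.
CURRENT_SCHEMA_VERSION = 2


def migrate_v1_to_v2(doc: dict) -> dict:
    """Hypothetical migration: v2 adds a 'tags' field and leaves old fields untouched."""
    upgraded = dict(doc)  # copy so the caller's document is not mutated
    upgraded.setdefault("tags", [])
    upgraded["schema_version"] = 2
    return upgraded


# One migration step per old version, applied in order.
MIGRATIONS: Dict[int, Callable[[dict], dict]] = {1: migrate_v1_to_v2}


def normalize(doc: dict) -> Optional[dict]:
    """Return the document in the current schema, or None if it must be rejected."""
    # Assumption: documents written before versioning existed are treated as v1.
    version = doc.get("schema_version", 1)
    while version < CURRENT_SCHEMA_VERSION:
        step = MIGRATIONS.get(version)
        if step is None:
            return None  # no migration path: reject (ignore) the document
        doc = step(doc)
        version = doc["schema_version"]
    if version > CURRENT_SCHEMA_VERSION:
        return None  # written by a newer app version: reject here
    return doc  # already current: accept as-is
```

Calling `normalize` on every document before it reaches application code means a stale document from a "couch device" is either upgraded or skipped instead of crashing the app.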
Same-Version Compatibility
Some applications do not need backward- or forward-compatibility, which can simplify their business logic significantly.
For example, you can use schema_version = {number} as a convention to specify the collection schema version your app will be listening to. Then, in your application, you can be sure that you are only selecting documents that come from schema versions that your current application code can support.
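As a rough sketch of the same-version convention, the snippet below filters a list of documents, represented as plain dictionaries, down to the one version the app supports. `SUPPORTED_SCHEMA_VERSION` and `select_supported` are hypothetical names, and in a real app the condition would normally be part of the query or subscription rather than applied in memory; the query syntax is not shown here.

```python
from typing import Iterable, List

# Hypothetical: the single schema version this app build reads and writes.
SUPPORTED_SCHEMA_VERSION = 3


def select_supported(documents: Iterable[dict]) -> List[dict]:
    """Keep only documents whose schema_version matches the supported version."""
    return [
        doc
        for doc in documents
        if doc.get("schema_version") == SUPPORTED_SCHEMA_VERSION
    ]
```

Documents without a `schema_version` field are dropped as well, since `doc.get` returns `None` for them.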
Force Upgrade
When a new application version is detected, you can stop synchronizing. You can detect that a new application version is available by querying for a schema_version that is greater than the current version. If a new version is detected, stop sync and tell the user they need to upgrade their app to the latest version.
This is a common pattern that many applications use. For example, Apple Notes warns users that they are on an older version and will experience degraded features until they upgrade.
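The force-upgrade check can be sketched as below. `SyncController` is a hypothetical stand-in for whatever object starts and stops sync in your app; it is not a Ditto class. `APP_SCHEMA_VERSION`, `newer_version_seen`, and `check_for_upgrade` are likewise names invented for the example, and documents are plain dictionaries.

```python
from typing import Iterable, Optional

# Hypothetical: the schema version this app build was written against.
APP_SCHEMA_VERSION = 2


class SyncController:
    """Hypothetical stand-in for the part of the app that controls sync."""

    def __init__(self) -> None:
        self.syncing = True
        self.message: Optional[str] = None

    def stop(self, message: str) -> None:
        self.syncing = False
        self.message = message  # in a real app, show this to the user


def newer_version_seen(documents: Iterable[dict]) -> bool:
    """True if any document carries a schema_version above the app's own."""
    return any(
        isinstance(doc.get("schema_version"), int)
        and doc["schema_version"] > APP_SCHEMA_VERSION
        for doc in documents
    )


def check_for_upgrade(documents: Iterable[dict], sync: SyncController) -> bool:
    """Stop sync and ask for an upgrade when newer-schema documents appear."""
    if newer_version_seen(documents):
        sync.stop("Please upgrade to the latest version of the app.")
        return True
    return False
```

Running the check whenever new documents arrive keeps an outdated client from reading or writing data it does not understand.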
Forward-Compatibility
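An application is forward-compatible when existing code is able to read new data. Web development offers a familiar example: browsers ignore HTML elements and attributes they do not recognize, so older browsers can still render newer pages.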
In a typical centralized database like PostgreSQL, developers often focus on backward‑compatibility, where newer versions of the application can open old documents. In a distributed system, you do not have central control of all modifications to data. It is difficult and sometimes impossible to control all versions of your application that are active in production environments. Because of these constraints, you need to think about not only backward‑compatibility but also forward‑compatibility.
To achieve forward‑compatibility of your database, you should never change the type of an existing field. In other words, developers should only ever add new fields, and never remove or modify old fields. You can ensure this by creating a model that encapsulates Ditto collections and is used across your application(s) to validate the field values and their associated types before inserting those values into the database.
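One way to picture such a model is a small validation layer that every write passes through. The sketch below is an assumption-laden illustration: the field names in `FIELD_TYPES`, the `SchemaError` exception, and the `validate` function are invented for the example, and the actual insert into the database is left out.

```python
# Hypothetical field registry for one collection. Existing entries never
# change type; new fields are only ever added to this mapping.
FIELD_TYPES = {
    "_id": str,
    "name": str,
    "quantity": int,
    "schema_version": int,
}


class SchemaError(ValueError):
    """Raised when a document does not match the declared field types."""


def validate(doc: dict) -> dict:
    """Check known fields and their types before the document is inserted."""
    for field, expected in FIELD_TYPES.items():
        if field not in doc:
            raise SchemaError(f"missing field: {field}")
        # Exact type match, so a bool is not accepted where an int is declared.
        if type(doc[field]) is not expected:
            raise SchemaError(
                f"{field} must be {expected.__name__}, "
                f"got {type(doc[field]).__name__}"
            )
    # Unknown fields are allowed: newer app versions may have added them.
    return doc
```

Allowing unknown fields while pinning the types of known ones is what lets older code keep reading documents written by newer versions.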