Sync and Replication Concepts
The real-time, end-user transactional data generated by your app can be distributed across devices in three ways:
- Centralized — A Small Peer device establishes a direct internet connection to the Big Peer cloud deployment.
- Decentralized — A Small Peer device establishes a mesh network connection with nearby Small Peer devices using all communication transport types available to them by default.
- Hybrid — If one or more Small Peer devices in the mesh gain internet access, those devices sync their local databases with the Big Peer cloud, and, by way of multihop replication, so does every nearby offline Small Peer.
For instructions on enabling and disabling transport types, see Configuring Transports.
Data sync involves bidirectional transmission with the Big Peer cloud deployment: the Big Peer both sends updates to and receives updates from connected peers. Data replication, by contrast, partitions data into smaller, more manageable pieces that are then unidirectionally mirrored across the mesh network.
Syncing large documents can significantly impact network performance:
Exercise caution when handling large binary data, such as a high-resolution image or video exceeding 50 megapixels, a deeply embedded document, or a very large document.
Instead of storing files exceeding 250 KB directly within a document object, carefully consider using attachments. For more information, see Attachment Objects.
The following graphic illustrates the bidirectional sync process and the unidirectional data egress replication process:
The Big Peer supports both bidirectional sync and unidirectional replication.
Before you can establish a peer-to-peer mesh network and exchange data directly with nearby peers, you must connect to the Big Peer at least once to set up your communication infrastructure.
When you connect to the Big Peer for the first time, you obtain your access credentials and ensure you have the latest data, bringing you up to date with the rest of the peers. For more information, see Authentication.
Any Small Peer with access to the internet automatically establishes a connection with the Big Peer to sync data across the Big Peer and, by way of multihop replication, all connected Small Peers. For more information, see Multihop Replication below.
By default, devices with the same app ID automatically form a mesh network to connect with one another.
However, you can streamline replication processes, minimize unnecessary data transfer, and optimize resource usage by configuring distinct groups.
For instructions on setting up sync groups, see Transport Configurations > Sync Groups.
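To make the effect of distinct groups concrete, the following TypeScript sketch models group membership as a gate on mesh connections. It is a conceptual illustration only; `Peer`, `canConnect`, and the field names are hypothetical and are not the Ditto SDK's API:

```typescript
// Conceptual illustration only -- Peer, canConnect, and these field
// names are hypothetical, not the Ditto SDK's API.
interface Peer {
  id: string;
  appId: string;
  syncGroup: number; // peers only mesh with peers in the same group
}

// Two devices form a mesh connection only when both identifiers match.
function canConnect(a: Peer, b: Peer): boolean {
  return a.appId === b.appId && a.syncGroup === b.syncGroup;
}

const kiosk: Peer = { id: "kiosk-1", appId: "com.example.app", syncGroup: 1 };
const courier: Peer = { id: "courier-7", appId: "com.example.app", syncGroup: 2 };

console.log(canConnect(kiosk, courier)); // false: distinct groups avoid unneeded transfer
```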
The multiple replicas that result from a single partition are spread across virtual and physical nodes; that is, the Big Peer cloud deployment and the local Ditto stores of subscribing Small Peers.
By default, Ditto does not automatically replicate data to peers. Instead, Small Peers use subscriptions and live queries to precisely indicate the data they're interested in. Once that data changes, only the delta matching their live query replicates to their local Ditto store.
In simpler terms, replication involves a subscribing Small Peer selectively "pulling" data from peers, rather than remote Small Peers automatically "pushing" data to them.
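In code, declaring that interest might look like the following. This sketch assumes the v4 JavaScript SDK and DQL; the identity values and the `orders` collection are placeholders, and exact signatures may differ across SDK versions:

```typescript
import { init, Ditto } from "@dittolive/ditto";

// Placeholders below (appID, token, the "orders" collection) are
// illustrative; exact signatures may differ across SDK versions.
await init();
const ditto = new Ditto({
  type: "onlinePlayground",
  appID: "your-app-id", // placeholder
  token: "your-token",  // placeholder
});
ditto.startSync();

// Nothing replicates to this peer until a subscription declares interest;
// only deltas matching the query are pulled into the local Ditto store.
ditto.sync.registerSubscription("SELECT * FROM orders WHERE status = 'open'");

// Observe the local store as matching deltas arrive from other peers.
ditto.store.registerObserver(
  "SELECT * FROM orders WHERE status = 'open'",
  (result) => console.log(`open orders: ${result.items.length}`),
);
```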
As illustrated in the following graphic, if a Small Peer is not subscribed to any queries, no documents replicate to them:
If at any time a Small Peer connected in the mesh gains access to the internet, all mesh-generated data gets automatically exchanged with the Big Peer by way of multihop replication.
Multihop replication is the process of passing data from one connected Small Peer to another connected Small Peer by way of intermediate “hops” along a given path.
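As a rough conceptual model (not Ditto's actual router), multihop forwarding can be sketched as a flood in which a delta crosses a hop only when the next peer observes the same data:

```typescript
// Conceptual model of multihop forwarding, not Ditto's actual router.
// An edge exists where two peers share a transport connection; a delta
// crosses a hop only if the next peer observes the same data.
type Mesh = Map<string, string[]>; // peer -> directly reachable peers
const observers = new Set(["A", "B", "C"]); // peers subscribed to the same query

function reach(mesh: Mesh, origin: string): Set<string> {
  const seen = new Set([origin]);
  const queue = [origin];
  while (queue.length > 0) {
    const peer = queue.shift()!;
    for (const next of mesh.get(peer) ?? []) {
      // "D" never receives the delta: it breaks the chain by not subscribing.
      if (!seen.has(next) && observers.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return seen;
}

// A reaches C through intermediate hop B, even with no direct A-C link.
const mesh: Mesh = new Map([
  ["A", ["B"]],
  ["B", ["A", "C", "D"]],
  ["C", ["B"]],
  ["D", ["B"]],
]);
console.log([...reach(mesh, "A")]); // ["A", "B", "C"]
```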
For multihop replication to occur, all peers within the chain must observe the same data, as demonstrated in the following graphic:
With Ditto's multihop replication technology, you achieve three significant advantages:
- Guarantee a consistent offline-first sync of critical data, ensuring data remains consistent even when connectivity is intermittent.
- Enhance the overall responsiveness of your app, resulting in faster interactions.
- Reduce costs by optimizing data operations since, when feasible, you can read and write data to a nearby peer instead of traversing a fragile, low-bandwidth data source.
The multihop replication process consists of the following:
- First, a peer discovers which nearby peers are reachable. The Ditto router continuously responds to changes in the local mesh, such as peer disconnects and transport type limitations.
- When a peer establishes a mesh network connection with another peer, it first sets up encrypted internal TLS tunnels, or Ditto Links, that nearby peers use for transmission.
Similar in function to a virtual private network (VPN), Ditto Links are virtual private connections that use the Noise protocol to establish secure, protected connections.
The Noise Protocol is a framework for building cryptographic protocols. For more information, see the official documentation at noiseprotocol.org.
A challenge arises in offline scenarios when two or more peers make edits independently and the data values stored by each peer diverge over time.
Referred to as conflict resolution, Ditto's process of addressing concurrency conflicts involves a combination of a metadata database and guiding principles.
Each peer individually maintains its own metadata database. The metadata database serves as a storage repository of information essential in resolving conflicts that arise during the process of merging concurrent changes.
Within this metadata database is the version vector. The version vector manages the following essential details for a particular peer (see the sketch after this list):
- The current state of replication.
- A record of the data previously sent and received.
- The sequence of updates.
- The timestamp of the last write operation.
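As a mental model, the per-document metadata could be shaped like the following TypeScript types. The names are illustrative only and do not reflect Ditto's internal schema:

```typescript
// Illustrative shape only -- not Ditto's internal schema. Each peer
// keeps, per document, a map from site (peer) ID to the highest version
// of that site's changes it has sent or received and merged.
type SiteId = string;
type VersionVector = Map<SiteId, number>;

interface DocumentMetadata {
  versionVector: VersionVector; // current replication state, per peer
  lastWriteTimestamp: bigint;   // HLC timestamp of the last write
}
```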
Ditto adheres to the following strategies to ensure that all peers ultimately reach the same value for a particular data item:
- Deterministic — As part of the strategy of eventual consistency, regardless of the order in which updates from different peers are received and merged, all peers ultimately converge to the same single value.
- Predictable and Meaningful — Instead of arbitrarily resolving conflicting registers to a predefined value, the resulting merge accurately represents the original input and some rational interpretation of that input.
The following scenario provides a walkthrough of the mechanics of version vectors, their role in determining merging behavior, and how different peers contribute to data replication:
In Ditto, the HLC uses the UInt128 data type to represent the Site_ID and 64-bit timestamp; however, for simplicity, the following scenario uses basic string and number types instead.
Local Peer A's document 123abc links to a version vector indicating:
- Its locally stored document is currently at version 5.
- The most recent incoming Remote Peer B changes were incorporated and merged at version 1.
- The Remote Peer C changes were incorporated and merged at version 4.
Local Peer A receives document changes from Remote Peer B. The incoming document's version vector indicates:
- Remote Peer B's incoming version vector value is 4.
- The most recent Remote Peer B changes merged locally were at version 1, a value less than the incoming version vector value of 4.
Since 4 is greater than 1, Local Peer A determines that the changes are new and must be incorporated and merged to remain consistent.
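The comparison in this scenario can be expressed directly in code. A minimal sketch, using plain strings and numbers in place of Ditto's UInt128 identifiers:

```typescript
// Plain strings and numbers stand in for Ditto's UInt128 site IDs and
// 64-bit timestamps. Peer A's version vector for document "123abc":
const peerA = new Map<string, number>([
  ["A", 5], // local document state
  ["B", 1], // last merged change from Remote Peer B
  ["C", 4], // last merged change from Remote Peer C
]);

// An incoming change from Remote Peer B carries B's current version.
const incoming = { site: "B", version: 4 };

// A has only merged B's changes up to version 1; since 4 > 1, the
// incoming changes are new and must be incorporated and merged.
const lastMerged = peerA.get(incoming.site) ?? 0;
if (incoming.version > lastMerged) {
  peerA.set(incoming.site, incoming.version); // merge, then record
}
console.log(peerA.get("B")); // 4
```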
Each document on each peer contains a hidden metadata map of a Site_ID and an HLC, or hybrid logical clock. The HLC is used to determine whether one change "happened before" another.
It might be tempting to use physical clocks to resolve conflicts when merging concurrent data changes. However, even quartz-crystal-based physical clocks can skew forward or backward in time. Almost every device regularly attempts to synchronize with an NTP-synchronized clock server, but even then, the round trip from the request to the server's response adds variability. In addition, limitations of nature and physics mean two measurements of physical time will never align precisely. These conditions led us to determine that physical clocks are not reliable in a distributed mesh network.
Although we decided that we could not build a system that resolves conflicts purely on physical time, we needed to preserve the notion of physical time so as not to confuse users of collaborative applications. Each peer still needs a deterministic way to resolve conflicts: when sharing CRDT deltas, every peer must resolve conflicts in exactly the same way. This calls for logical ordering, which led us to implement the version vector with a Hybrid Logical Clock (HLC).
In Ditto's distributed system, the HLC is a 64-bit timestamp comprised of 48 bits of a physical timestamp and 16 bits of a monotonically increasing logical clock.
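The 48/16 split can be illustrated with bigint packing. This is a sketch of the layout only; Ditto's actual encoding may differ in detail:

```typescript
// Sketch of the 64-bit layout: 48 bits of physical time (milliseconds)
// in the high bits, 16 bits of logical counter in the low bits.
// Illustrative only; Ditto's actual encoding may differ in detail.
const LOGICAL_BITS = 16n;
const PHYSICAL_MASK = (1n << 48n) - 1n;
const LOGICAL_MASK = (1n << LOGICAL_BITS) - 1n;

function pack(physicalMs: bigint, logical: bigint): bigint {
  return ((physicalMs & PHYSICAL_MASK) << LOGICAL_BITS) | (logical & LOGICAL_MASK);
}

function unpack(hlc: bigint): { physicalMs: bigint; logical: bigint } {
  return { physicalMs: hlc >> LOGICAL_BITS, logical: hlc & LOGICAL_MASK };
}

const now = BigInt(Date.now());
const t1 = pack(now, 0n);
const t2 = pack(now, 1n); // same millisecond, later logical tick
console.log(t2 > t1); // true: ordering holds even within one millisecond
```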
Ditto requires that all devices participating in the mesh have the time and date configured accurately. It is recommended to synchronize the clock automatically with an internet time server; this feature is built into most platforms, including iOS and Android, and is sufficiently accurate for Ditto's needs.
If a device has a significantly incorrect clock, outcomes may include:
- Failing to connect to a peer because the peer's certificate appears not yet valid or already expired.
- Unintuitive conflict resolution in certain situations when concurrently editing documents.
Each peer has its own version of the documents that multiple peers can work on concurrently. They keep track of their respective document changes in a data structure known as a version vector.
The version vectors that each document links to contain a snapshot of essential metadata, such as the given document's current state, as well as a record of the data previously sent and received.
When a peer directly updates its own locally stored document, the document's version vector increments by one.
Local peers use document version vectors to accurately assess local and observed edits and ensure data consistency:
- Before a peer incorporates and merges incoming changes into its document state, it compares its existing version vector value with the incoming version vector value to determine whether the incoming changes are new (or old) relative to its current document state.
- If the incoming version vector value is greater than the existing version vector value, the incoming changes are new and must be incorporated and merged for the document state to remain consistent.
The version vector pinpoints the chronological order of events and determines the exact timing of the most recent write operation for a specific peer using a combination of a temporal timestamp and the Hybrid Logical Clock (HLC), as sketched in code after this list:
- The temporal timestamp is a component associated with an individual peer that tracks the chronological order of its local events.
- The HLC is a higher-level component that combines both physical time and logical clocks to create a unified timestamp that tracks the chronological order of events for the entire distributed system.
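The following sketch shows the general hybrid logical clock update rules (after Kulkarni et al.), combining the physical and logical components into one monotonic timestamp. It illustrates the technique in general and is not necessarily Ditto's exact implementation:

```typescript
// The general hybrid logical clock update rules (after Kulkarni et al.),
// not necessarily Ditto's exact implementation. The clock never runs
// backward, even when the underlying physical clock does.
interface Hlc {
  physical: number; // wall-clock component, milliseconds
  logical: number;  // tie-breaking counter within one millisecond
}

// Advance the clock for a local event, or before sending a change.
function tickLocal(clock: Hlc, wallMs: number): Hlc {
  if (wallMs > clock.physical) return { physical: wallMs, logical: 0 };
  return { physical: clock.physical, logical: clock.logical + 1 };
}

// Advance the clock when receiving a remote timestamp: move past both
// the local clock and the remote one so "happened before" is preserved.
function tickReceive(clock: Hlc, remote: Hlc, wallMs: number): Hlc {
  const physical = Math.max(wallMs, clock.physical, remote.physical);
  let logical = 0;
  if (physical === clock.physical && physical === remote.physical) {
    logical = Math.max(clock.logical, remote.logical) + 1;
  } else if (physical === clock.physical) {
    logical = clock.logical + 1;
  } else if (physical === remote.physical) {
    logical = remote.logical + 1;
  }
  return { physical, logical };
}
```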