Data Types
The conflict-free replicated data type (CRDT) data structure enables smooth peer‑to‑peer offline replication and online sync in distributed systems.
When multiple offline Ditto stores make concurrent updates to the same CRDT object, the CRDT merge mechanism remediates the data collision so that, when sync takes place, they converge to a consistent state. For more information, see the Platform Manual > Sync and Replication Concepts.
This topic provides a catalog of the types of CRDTs you can use in Ditto: register, map, counter, and attachment.
To start building your mental model, think of a CRDT as a container holding two essential components: the actual data to store and some metadata that helps in resolving conflicts.
The register type holds the actual data to be stored; each set of fields in a document or key‑value pair embedded in a map functions as a single register object.
For example, the following document has three fields, two stored as registers and one as a counter:
Field Property | Value | CRDT |
name | 'Frank' | register (string) |
age | 31 | register (number) |
ownedCars | 0 | counter (number) |
Since each set of fields in a document and, if embedded within a map, each key-value pair act as a single register, you can structure, organize, and manage your data more efficiently.
You can encode simple data as well as highly-complex datasets in a register:
- To encode simple data in a register, use primitive JSON-compatible types, such as string, boolean, number, and so on.
- To encode more complex datasets in a register, use a combination of primitive types, registers, and maps.
The following table provides an overview of the types you can use to store a single value in a register:
Data Type | Corresponding Values |
array | An ordered list of values, where each value represents any primitive type, as well as nested collection types such as other arrays or maps |
boolean | true or false |
float | 64-bit Floating-Point |
map | Embedded document object |
null | Represents an absence of a value |
int | Signed 64-bit integer |
string | UTF-8 encodable |
unsigned int | Unsigned 64-bit integer |
Each register adheres to the last-writer-wins principle at merge, ensuring conflict-free replication and data consistency throughout the distributed system.
With last-writer-wins merge semantics, when conflicting updates occur, the update made by the last writer always takes precedence and propagates across peers as the single source of truth — the definitive value.
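The last-writer-wins rule described above can be sketched in a few lines of Python. This is an illustrative model only, not Ditto's implementation: the `Register` class, its `timestamp` and `peer_id` fields, and the tie-breaking rule are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Register:
    """A value plus the metadata used to decide which write is 'last'."""
    value: Any
    timestamp: int  # hypothetical logical clock; larger means written later
    peer_id: str    # hypothetical tie-breaker for equal timestamps


def merge(a: Register, b: Register) -> Register:
    """Return the register written last; ties fall back to comparing peer IDs."""
    return max(a, b, key=lambda r: (r.timestamp, r.peer_id))


# Two offline peers write the same field concurrently.
from_peer_a = Register(value="Frank", timestamp=3, peer_id="A")
from_peer_b = Register(value="Franklin", timestamp=5, peer_id="B")

# Merging in either order picks the same winner, so both peers converge.
merged = merge(from_peer_a, from_peer_b)
```

Because `merge` depends only on the metadata and not on the order in which updates arrive, every peer that sees both writes ends up with the same value.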
An extension to the register type, an array is an ordered collection of items represented using any primitive JSON‑compatible data type enclosed within square brackets ([ ]).
Avoid using arrays in Ditto because of potential merge conflicts when offline peers reconnect to the mesh and attempt to sync their updates, especially when multiple peers make concurrent updates to the same item within the array.
CRDT-based arrays, when compared to typical arrays, have certain strengths and limitations.
While CRDTs excel at managing unordered operations, maintaining a strict order is less efficient, which could potentially lead to data inconsistencies at merge.
This is because, unlike traditional arrays, the multiple elements within a CRDT-based array function collectively as a single register, and, as previously explained in Ditto Basics, a register is effectively a container that holds multiple elements (the actual data and various metadata), just like regular arrays.
In simple terms, imagine an array like a string. Much like the characters that together form a word, each element in a CRDT-based array plays a unique role in determining the final value.
And similar to how you access elements in a typical array, you reference each individual element in a CRDT-based array (a register) by its index. If those indexes change while a peer is disconnected from the network, the changes may fail to merge once the peer reconnects because of concurrency conflicts, resulting in data inconsistencies.
Considering this, avoid using arrays in Ditto and opt for a map instead, which effectively functions as a set and is more commonly known as an associative array.
When managing data that requires unique identifiers and relationships, use a map with unique string keys and object values instead of an array to encode your data.
For example, instead of representing an array of cars, where each element represents a car:
Implement a map instead:
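As a rough sketch of the difference, the following Python models the two shapes and shows why keyed entries merge cleanly. The car IDs, field names, and the `merge_keyed` helper are hypothetical; the helper is a simplified stand-in for a map merge and does not handle two peers editing the same field.

```python
# Array shape: positions identify cars, so concurrent edits can shift indexes.
cars_array = [
    {"make": "Honda", "color": "red"},
    {"make": "Ford", "color": "black"},
]

# Map shape: each car lives under its own unique string key.
cars_map = {
    "car_1": {"make": "Honda", "color": "red"},
    "car_2": {"make": "Ford", "color": "black"},
}


def merge_keyed(base: dict, *peer_edits: dict) -> dict:
    """Apply each peer's per-key edits to a copy of the base map."""
    merged = {key: dict(fields) for key, fields in base.items()}
    for edits in peer_edits:
        for key, fields in edits.items():
            merged.setdefault(key, {}).update(fields)
    return merged


# Peer A repaints car_1 while peer B, offline, adds car_3.
merged = merge_keyed(
    cars_map,
    {"car_1": {"color": "blue"}},
    {"car_3": {"make": "Tesla", "color": "white"}},
)
```

Neither edit depends on the position of another entry, so both survive the merge regardless of the order in which they are applied.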
With conflict-free replicated data type (CRDT) technology, each document is represented as a map.
A map is a JSON-like object that serves as the basis of each Ditto document and is structured as a collection of field-value pairs:
- To represent simple values in a map, use any primitive data type, such as a string, boolean, number, and so on.
- To represent a highly-complex data structure in a map, use register, counter, array, or embed another map. Embedding a map within another map establishes an additional hierarchy.
The following snippet demonstrates a Ditto document with an embedded map:
Following are the key attributes of the map type:
- A map is represented in the document as a tree-like structure that establishes a hierarchical, parent-child relationship between the dataset in the document.
- Use maps in scenarios where you want to create a list of items and update that list over time.
- If one peer creates a field within a document as an array and another peer creates it as an object, such as a map or register, the values do not merge.
To create a single map represented as a JSON-like root object in the document, use the following data model:
If you need to represent and organize data in a hierarchical structure, you can embed a map within another map to establish a parent-child relationship within a document:
Each document in Ditto is inherently a map object at its root.
That is, when you use the upsert API to create a new document, Ditto automatically generates a CRDT map at the root of the document. For instance, the parent field in the following snippet is actually the document's root map. For more information, see Platform Manual > Document Model.
When updating and adding fields to a map embedded within another map, use keypath indexing. Also referred to as dot syntax, a keypath index concisely specifies the fields to update when working with a map embedded within another map.
In a keypath index, each dot (.) represents a level in the map hierarchy. For example, friends.foo indicates that the foo field is a child of the parent friends map, as demonstrated in the following snippet.
The following snippet results in all of the values in the friends map being replaced with the new object ({ "beep": "boop" }):
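To make the keypath behavior concrete, here is a minimal Python model that operates on plain dictionaries. The `set_at_keypath` helper is a hypothetical stand-in rather than a Ditto SDK call, and the sample values other than `{ "beep": "boop" }` are invented for illustration.

```python
import copy
from typing import Any


def set_at_keypath(document: dict, keypath: str, value: Any) -> None:
    """Set `value` at a dot-separated keypath, creating parent maps as needed."""
    *parents, leaf = keypath.split(".")
    node = document
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


doc = {"friends": {"foo": "bar", "baz": "qux"}}

# Each dot descends one level: only the foo field inside friends changes.
set_at_keypath(doc, "friends.foo", "updated")
after_field_update = copy.deepcopy(doc)

# Targeting the parent keypath replaces the whole embedded map.
set_at_keypath(doc, "friends", {"beep": "boop"})
```

The first call touches a single child field, while the second discards every previous field in `friends`, which is why the keypath you choose determines how much of the map is overwritten.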
Removing a Map
Since CRDT map values merge with the existing document, simply omitting them from the CRDT map does not remove them.
Instead, the CRDT map creates an operation for that field and subsequently the existing fields remain unchanged.
By calling the .remove() API method, as follows, you omit only the foo field from the friends map within the document, while the other fields within the friends map remain unaffected:
When you want to clear the entire map structure embedded in the document, call the .set() method.
For example, the following snippet illustrates the process of removing the friends map through the .set() method, making it empty.
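The difference between omitting a field, removing a field, and clearing the map can be modeled with plain dictionaries. The helpers `merge_upsert` and `remove_at_keypath` below are hypothetical Python stand-ins written for this sketch; they are not the Ditto SDK methods themselves.

```python
import copy


def merge_upsert(document: dict, incoming: dict) -> None:
    """Merge `incoming` into `document` field by field; omitted fields are kept."""
    for key, value in incoming.items():
        if isinstance(value, dict) and isinstance(document.get(key), dict):
            merge_upsert(document[key], value)
        else:
            document[key] = value


def remove_at_keypath(document: dict, keypath: str) -> None:
    """Delete the field at a dot-separated keypath, if it exists."""
    *parents, leaf = keypath.split(".")
    node = document
    for key in parents:
        node = node.get(key)
        if not isinstance(node, dict):
            return
    node.pop(leaf, None)


doc = {"friends": {"foo": "bar", "beep": "boop"}}

# Omitting foo from the incoming map does not remove it.
merge_upsert(doc, {"friends": {"beep": "updated"}})
after_upsert = copy.deepcopy(doc)

# Removing by keypath deletes only foo.
remove_at_keypath(doc, "friends.foo")
after_remove = copy.deepcopy(doc)

# Setting the map to an empty object clears it entirely.
doc["friends"] = {}
```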
An issue unique to maps is the possibility for two offline peers to create a new document, in which one peer represents the field as an object (map), while the other peer represents the field as an array.
The following snippets illustrate a scenario of a type-level conflict unique to maps. Peer A creates the following new document:
While at the same time Peer B creates the following new document:
Because peer A and peer B use divergent data structures, combining an array with an object (map) is impossible. Rather than adhering to the default "last updated type wins" principle, which might trigger a ping-pong alteration of types between the peers, retain both values for the address property.
A ping-pong alteration of types occurs when distinct peers repeatedly modify the data type of a specific field in response to each other's updates, leading to an endless loop of back‑and‑forth behavior.
To avoid the ping-pong effect when conflicts between data types occur, retain both the array and the map object representations of the field in the document. For example, in the previous scenario, you keep both versions of the conflicting representation of the address field in the document.
Retaining both the array and map representations:
- Prevents back-and-forth, ping-pong behavior
- Ensures that there is no loss of data
- Provides flexibility, allowing you to choose the data type that is most appropriate for encoding data in JSON based on your specific requirements and use case
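One way to picture retaining both representations is a field slot that stores one value per kind of type. The sketch below is a hypothetical model: the `kind` and `merge_field` helpers, the slot layout, and the sample address values are assumptions for illustration and do not describe how Ditto stores the conflict internally.

```python
def kind(value) -> str:
    """Classify a value as a map, an array, or a plain register value."""
    if isinstance(value, dict):
        return "map"
    if isinstance(value, list):
        return "array"
    return "register"


def merge_field(slot_a: dict, slot_b: dict) -> dict:
    """Union two slots keyed by kind, so values of different types both survive."""
    merged = dict(slot_a)
    for value_kind, value in slot_b.items():
        # Values of the same kind would merge by that kind's own rules,
        # which this sketch leaves out.
        merged.setdefault(value_kind, value)
    return merged


# Peer A models address as a map; peer B models it as an array.
peer_a_address = {"street": "123 Main St", "city": "Springfield"}
peer_b_address = ["123 Main St", "Springfield"]

slot_a = {kind(peer_a_address): peer_a_address}
slot_b = {kind(peer_b_address): peer_b_address}

merged_ab = merge_field(slot_a, slot_b)
merged_ba = merge_field(slot_b, slot_a)
```

Both merge orders produce the same slot containing the map and the array, so neither peer's write triggers a type flip on the other.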
The best approach to handle conflicts that result from two peers making concurrent offline edits and then later rejoining online depends on your specific requirements and use case.
Following is an overview of best approaches for handling concurrency conflicts:
- Resolving concurrency conflicts: If you want to give priority to the "latest" change, use a register. Ditto's register type uses last-writer-wins semantics, so the value written last always becomes the current value.
- Auditing concurrency conflicts: If you want to keep track of the changes made by different peers over time, use the map type to model your list of operations. Each write operation is independently tracked as a field-value pair, with the field representing the unique identifier and the value storing only the specific changes made by a given peer.
- Prompting end users to choose: If you want your end users to resolve concurrency conflicts instead of Ditto, use the map type inside of your document and prompt end users to select the value to replicate.
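The auditing and prompting approaches above can be sketched as a map of operations keyed by unique identifiers. The `record_change` and `merge_logs` helpers and the entry layout are hypothetical and exist only to illustrate the data model; they are not part of any Ditto SDK.

```python
import uuid


def record_change(log: dict, peer: str, change: dict) -> str:
    """Store one change under a fresh unique key and return that key."""
    op_id = str(uuid.uuid4())
    log[op_id] = {"peer": peer, "change": change}
    return op_id


def merge_logs(a: dict, b: dict) -> dict:
    """Unique keys never collide, so merging is a union that keeps every entry."""
    return {**a, **b}


# Each peer records its own edit while offline.
log_a, log_b = {}, {}
id_a = record_change(log_a, "A", {"color": "blue"})
id_b = record_change(log_b, "B", {"color": "green"})

# After sync, both edits are preserved for auditing.
merged = merge_logs(log_a, log_b)

# The retained values can be shown to an end user to choose from.
candidates = sorted(entry["change"]["color"] for entry in merged.values())
```

Because no write overwrites another, the application can audit the full history or defer the final decision to the end user.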
Imagine a scenario in which two Ditto stores, peer A and peer B, have the following document:
Peer A calls the .upsert() method to change the field-value color:red to color:blue:
While at the same time peer B calls the .update() method to change the value of the mileage field:
When the changes replicate across the distributed peers, both changes merge resulting in both peer A and peer B Ditto stores having the mileage increment of 200 and the color change to blue:
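The scenario can be modeled in Python as two change sets that touch different fields. The `_id` and the starting mileage of 3000 are invented for this sketch, and `merge_field_changes` is a hypothetical helper that only covers the case where peers change different fields.

```python
def merge_field_changes(document: dict, *change_sets: dict) -> dict:
    """Apply each peer's changed fields to a copy of the document.

    A change set lists only the fields that peer modified, so edits to
    different fields never overwrite one another.
    """
    merged = dict(document)
    for changes in change_sets:
        merged.update(changes)
    return merged


base = {"_id": "car-1", "color": "red", "mileage": 3000}

peer_a_changes = {"color": "blue"}                   # peer A changes the color
peer_b_changes = {"mileage": base["mileage"] + 200}  # peer B adds 200 to mileage

merged = merge_field_changes(base, peer_a_changes, peer_b_changes)
```

Applying the change sets in the opposite order gives the same document, which is what lets both peers converge.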
The counter CRDT is a special type intended for use cases where multiple peers need to increment (or decrement) field values at the same time while maintaining consistency; for example, in scenarios such as inventory management or voting.
There are two methods of modifying an existing counter value:
- Increment: Increases the counter value by the specific number you want to add to the counter value.
- Decrement: Decreases the counter value by the specific negative value that you want to subtract from the counter.
The counter type is useful in very select scenarios. Consider using the .set() method to define new values on a map to track increments, while also storing additional metadata.
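One common way to build a counter that tolerates concurrent increments and decrements is a PN-counter, which tracks per-peer totals. The sketch below illustrates that general technique; whether Ditto's counter is implemented this way is not stated here, and the `PNCounter` class and its method names are assumptions for the example.

```python
class PNCounter:
    """Per-peer running totals of increments and decrements."""

    def __init__(self) -> None:
        self.increments: dict = {}
        self.decrements: dict = {}

    def increment(self, peer_id: str, amount: int) -> None:
        """Record a positive amount as an increment, a negative one as a decrement."""
        totals = self.increments if amount >= 0 else self.decrements
        totals[peer_id] = totals.get(peer_id, 0) + abs(amount)

    @property
    def value(self) -> int:
        return sum(self.increments.values()) - sum(self.decrements.values())

    def merge(self, other: "PNCounter") -> "PNCounter":
        """Take the per-peer maximum of each total; safe to repeat or reorder."""
        merged = PNCounter()
        for name in ("increments", "decrements"):
            mine, theirs = getattr(self, name), getattr(other, name)
            target = getattr(merged, name)
            for peer in set(mine) | set(theirs):
                target[peer] = max(mine.get(peer, 0), theirs.get(peer, 0))
        return merged


# Peer A adds 5; peer B adds 3 and then subtracts 1, all while offline.
a, b = PNCounter(), PNCounter()
a.increment("A", 5)
b.increment("B", 3)
b.increment("B", -1)

merged = a.merge(b)
```

Because each peer only ever grows its own totals, merging never loses an update and applying the same merge twice does not double count.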
To create a counter, use the upsert method to ensure the document exists in Ditto, and then use the .update() method to increment the counter and ensure that the correct and accurate counter value is maintained across all peers during replication:
Do not use the upsert method to increment a counter. Only modify the counter field within an update clause, as follows:
- Define the structure of the counter document.
- Using the upsert method, ensure that the document exists before interacting with it. If it does not exist, Ditto automatically creates the document and initializes the counter with a set value of 0.
- To increment the counter value, call the .update() method and pass the value you want to increment the counter by.
A counter is capable of recalling its previous increment and decrement calls; however, with a counter there is no concept of time. This is true even if you change the type of the counter field to another type, such as number, and then back into a counter again.
Given this constraint, avoid resetting counters. Instead, create new documents or collections that have meaning and encapsulate a period of time.
If it is essential that you reset the counter to zero (0), make sure to remove the counter from the document before attempting to reset the counter.
Note that if one peer increments the counter at the same time that another peer removes the counter, you may encounter unexpected behavior.
Syncing large documents can significantly impact sync performance:
Caution is advised when handling very large binary data, a deeply-embedded document, or a very large document.
Carefully consider using attachments instead of storing the data directly within a document object.
With the attachment CRDT you can associate very large amounts of binary data, such as an image, video, and so on, with a document and replicate across peers without conflict.
If you have very large amounts of binary data, such as a high-resolution image, video, or some other file above 50 megapixels, or if you have a deeply-embedded or a large document object, use an attachment instead of a regular document.
Unlike documents, attachments store data outside of the Ditto store running locally in the end-user environment and must be explicitly fetched to replicate across distributed peers.
For more information about Ditto documents, see the Platform Manual > Document Model.
Maintain a strong reference to attachmentFetcher for the entirety of the asynchronous fetch operation, for example by preserving the attachmentFetcher as a globally accessible instance. Doing so prevents the fetch operation from silently aborting.
The following snippet demonstrates a use case for leveraging the attachment CRDT, as well as the step-by-step process for creating and fetching the attachment:
If developing in Swift, for a tutorial on how to work with an attachment in a chat app, see Attachments: Chat App.
The following snippet demonstrates the process for creating, associating, and fetching an attachment.
- Define a collection named 'foo'.
- Using Base64-encoded image data and metadata, create an attachment object.
- Upsert a document with an attachment in the collection.
- Later, retrieve the document by _id and fetch the attachment using an attachmentFetcher.
The following table provides an overview of the CRDTs and associated behavior for a given operation. For more information, see CRUD Operations.
Operation | Description |
set register | Sets the value for a given field in the document. |
set map | Sets the value for a given field in the map. |
remove register | Removes a value for a given field in the document. |
remove map | Removes a value for a given key in the map structure. |
replace with counter | Converts a number value for a given field into a counter. |
increment counter | Unlike a number, increments the counter by the given positive integer value. |
decrement counter | Unlike a number, decrements the counter by the given negative integer value. |