In Ditto, there are two key concepts for removing documents:
Document Deletions using the DELETE keyword in DQL
Document Evictions using the EVICT keyword in DQL
Deletions and evictions are designed to work together to give you flexibility in managing data within your Ditto application.
Use DELETE when you want to permanently remove a document from a collection. Once a document is deleted with DELETE,
it is considered gone from the user’s perspective and cannot be recovered.
Use EVICT when you want to remove data only from the local device. Evictions are useful for scenarios where an Edge SDK
only needs to store and sync part of the database. Evicting a document doesn’t delete it from the system — it just frees up local storage.
In general, for applications with both Ditto Servers and Edge SDKs, the typical approach is:
Use DELETE to manage permanent deletion on Ditto Server.
Use EVICT to manage data on a Ditto Edge SDK.
Here’s an example to show how these might work together:
Imagine a system where devices running the Ditto Edge SDK only need to keep documents for 3 days due to storage limits, while Ditto Server stores documents for 90
days to meet data retention requirements. In this case, the Ditto Edge SDK would evict any documents older than 3 days, removing them
from local storage without deleting them from the system. After 90 days, the user would use the Ditto Server HTTP API to permanently delete the
documents.
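The Edge SDK half of this scenario could be sketched as follows. This is only an illustration under stated assumptions: `createdAt` is a hypothetical field holding epoch seconds (it is not a Ditto built-in), `evictionStatement` is a made-up helper, and the `cars` collection is borrowed from the examples further down. The helper only builds the DQL statement and its named argument; executing it is left to the application.

```swift
import Foundation

/// Builds an EVICT statement for documents older than `retentionDays`.
/// `createdAt` is a hypothetical field (epoch seconds), not a Ditto built-in.
func evictionStatement(collection: String,
                       retentionDays: Int,
                       now: Date = Date()) -> (query: String, arguments: [String: Any]) {
    // Anything created before the cutoff is older than the retention window.
    let cutoff = now.timeIntervalSince1970 - Double(retentionDays) * 24 * 60 * 60
    let query = "EVICT FROM \(collection) WHERE createdAt < :cutoff"
    return (query, ["cutoff": cutoff])
}

// Edge devices keep 3 days of data; the 90-day cleanup on Ditto Server would use DELETE instead.
let statement = evictionStatement(collection: "cars",
                                  retentionDays: 3,
                                  now: Date(timeIntervalSince1970: 1_000_000))
print(statement.query)
```

Passing the cutoff as an argument rather than interpolating it keeps the statement text constant between runs.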
All documents matching the DELETE condition will be permanently deleted. The IDs of the deleted documents can be retrieved
using the mutatedDocumentIDs method on the result.
The following example permanently deletes all blue cars stored in the cars collection:
// Permanently delete every blue car, then print the IDs of the deleted documents.
let result = try await ditto.store.execute(query: "DELETE FROM cars WHERE color = 'blue'")
result.mutatedDocumentIDs().forEach { print($0) }
DELETE will permanently remove the selected documents from the system.
Document tombstones are an internal system concept and are not accessible to the user.
Document tombstones are only shared with devices that have seen the document before its deletion event. Peers will
not receive tombstones for documents that they have never known about.
Document tombstones consist of the document ID and the fields of the document at the time of deletion; all field values are removed.
Tombstone reaping and removal can cause performance issues for large data sets. Customers with large numbers of deleted
documents should specify a reaper_preferred_hour and enable enable_reaper_preferred_hour_scheduling to ensure
minimal business impact.
If Ditto is not running on an Edge SDK device during the reaper_preferred_hour, expired tombstones will not be evicted.
Every DELETE statement performed on the Cloud/Server through the HTTP API is executed immediately as a single atomic operation. Removing a large number
of documents (approximately 50,000 or more) at once risks degrading performance across the larger system, including
slow sync times for all connected Edge SDK devices. To minimize potential impact, it’s recommended to delete documents in batches of 30,000 or fewer.
This can be done using the LIMIT keyword.
DELETE FROM <collection_name> WHERE <condition> LIMIT 30000
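A batching loop around this statement might look like the sketch below. It is not Ditto code: `deleteInBatches` is a made-up helper, and `runDelete` is a hypothetical stand-in for whatever actually executes the statement (for example, a call to the HTTP API) and reports how many documents it removed. The loop stops as soon as a batch deletes fewer documents than the limit.

```swift
/// Repeatedly issues a LIMIT-ed DELETE until a batch removes fewer documents than the limit.
/// `runDelete` is a hypothetical stand-in that executes the statement and
/// returns the number of documents it deleted.
func deleteInBatches(collection: String,
                     condition: String,
                     batchSize: Int = 30_000,
                     runDelete: (String) -> Int) -> Int {
    let statement = "DELETE FROM \(collection) WHERE \(condition) LIMIT \(batchSize)"
    var totalDeleted = 0
    while true {
        let deleted = runDelete(statement)
        totalDeleted += deleted
        if deleted < batchSize { break }   // a partial or empty batch means nothing is left
    }
    return totalDeleted
}

// Simulated backend holding 70,000 matching documents.
var remaining = 70_000
var calls = 0
let deletedCount = deleteInBatches(collection: "cars", condition: "color = 'blue'") { _ in
    calls += 1
    let n = min(remaining, 30_000)
    remaining -= n
    return n
}
print("deleted \(deletedCount) documents in \(calls) batches")
```

In a real deployment it may be worth pausing between batches so connected devices can catch up; the sketch omits that for brevity.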
Deleted documents will be permanently removed from the Cloud/Server after 30 days. To configure this, contact Ditto’s Customer Support.
Tombstone reaping (removal) runs every hour on the Ditto Cloud/Server. To configure this, contact Ditto’s Customer Support.
Auto-eviction of expired tombstones is enabled on SDK version 4.10.0 and later.
To ensure all peers in the system are aware of a deleted document, the Ditto system keeps a compressed version of the
document, called a document tombstone. A document tombstone indicates that the document is deleted and is
shared with other peers in the mesh so they learn of the deletion.
When a document is deleted all of the mutable document information is removed. The remaining document information includes
the document’s ID (_id) and metadata about the document including the time the document was deleted (deletion timestamp).
Document tombstones have a deletion timestamp. Once the tombstone’s age exceeds the configured time to live (TTL), the document
tombstone is considered expired and will be removed in a process called tombstone reaping. Tombstone
reaping is where the Ditto system scans for and removes expired tombstones. Tombstone reaping runs once daily
by default.
Document tombstones by default are retained on Edge SDK devices for 7 days before being automatically removed.
Keeping document tombstones around ensures that all peers are aware of the deleted document. Once the document
tombstone is removed from the system there will be no history of the document’s existence.
Configuring Document Tombstone Retention on the Edge SDK
Ditto offers a set of 5 system properties for configuring tombstone retention on an Edge SDK. All properties can be
set using the ALTER SYSTEM command on an Edge SDK device. For help configuring a retention policy that’s best
for your use case, reach out to Ditto’s Customer Support.
Default configuration:
Tombstone removal is disabled. To enable it, set the system parameter TOMBSTONE_TTL_ENABLED to true.
Scanning for expired tombstones, also known as tombstone reaping, happens shortly after the instance is created.
Tombstone reaping occurs once a day from the initial reaping.
Tombstones are marked as expired 7 days from the time of their deletion event. An expired tombstone will be removed
in the following reaping.
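The default schedule above can be made concrete with a small calculation. The sketch below is a simplified model, not Ditto’s implementation: it assumes reaping passes occur at exact multiples of the reaping interval after the first pass, and that a tombstone counts as expired the moment its age reaches the TTL. `isExpired` and `removalTime` are hypothetical helper names.

```swift
import Foundation

let hour: TimeInterval = 60 * 60

/// A tombstone is treated as expired once its age reaches the TTL (default 7 days).
func isExpired(deletedAt: Date, now: Date, ttlHours: Double = 7 * 24) -> Bool {
    return now.timeIntervalSince(deletedAt) >= ttlHours * hour
}

/// First reaping pass (repeating every `daysBetweenReaping` days from `firstReaping`)
/// at which the tombstone is already expired. Exact-interval scheduling is an assumption.
func removalTime(deletedAt: Date, firstReaping: Date,
                 ttlHours: Double = 7 * 24, daysBetweenReaping: Int = 1) -> Date {
    let expiry = deletedAt.addingTimeInterval(ttlHours * hour)
    if expiry <= firstReaping { return firstReaping }
    let interval = Double(daysBetweenReaping) * 24 * hour
    let passes = (expiry.timeIntervalSince(firstReaping) / interval).rounded(.up)
    return firstReaping.addingTimeInterval(passes * interval)
}

// A document deleted 4 hours after the first reaping pass expires 172 hours after that pass,
// so it is removed by the pass at 192 hours (8 days after the first pass).
let firstReaping = Date(timeIntervalSince1970: 0)
let deletedAt = firstReaping.addingTimeInterval(4 * hour)
let removal = removalTime(deletedAt: deletedAt, firstReaping: firstReaping)
print(removal.timeIntervalSince(firstReaping) / hour)
```

The gap between expiry and removal is why a tombstone can outlive its TTL by up to one reaping interval.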
Enabling this parameter on a running Ditto instance will not cause the peer to immediately check for expired tombstones; the peer
will wait until the next time tombstone reaping is scheduled to run.
Example: If the peer is configured to reap tombstones once a day around 9am, and TOMBSTONE_TTL_ENABLED is set to true at
1pm, the peer will wait until the next morning at 9am to run the tombstone reaping.
The threshold age in hours after which a document tombstone will be considered expired on the local peer. Once a document tombstone
is expired it will be removed in the next tombstone reaping.
Setting the system parameter
-- Setting TOMBSTONE_TTL_HOURS to 5 days (120 hours)
ALTER SYSTEM SET TOMBSTONE_TTL_HOURS = 120
The timestamp used to determine the tombstone’s age is written by the device that deleted the document. If a peer with an
inaccurate clock deletes a document, that document’s tombstone may be removed from the local peer earlier or later than expected.
Setting the TTL to a very low number may cause the deletion action to fail. This is because the device must share the document
tombstone with other peers in the mesh so they know the document has been deleted.
Setting the TTL to a very high number can cause performance issues due to the increased storage space requirement. Tombstones
are quite small and don’t take up much storage space, so the default of one week shouldn’t pose any problems for typical use cases.
If Edge SDK devices are expected to be offline for longer than a week then a higher TTL value may be needed to ensure the deletion
event is being propagated through the system.
Never set the Edge SDK TTL to a number larger than the Ditto Server TTL. By default Edge SDK TTL is 7 days and Ditto Server TTL is 30 days.
Because Ditto Server will always sync documents, setting this incorrectly will cause the tombstones to be sent back to Ditto Server
after they’ve been removed, leading to increased resource consumption as tombstones are repeatedly synced back to Ditto Server and
removed again.
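The TTL guidance above can be expressed as a simple pre-flight check. This is a sketch only: `ttlProblem` is a hypothetical helper, the defaults mirror the 7-day and 30-day values quoted above, and the "too short" threshold of 24 hours is an arbitrary illustrative value rather than a limit defined by Ditto.

```swift
/// Returns a description of the problem with an Edge SDK TTL, or nil if none is found.
/// `minimumSensibleHours` is an arbitrary illustrative threshold, not a Ditto limit.
func ttlProblem(edgeTTLHours: Int,
                serverTTLHours: Int = 30 * 24,
                minimumSensibleHours: Int = 24) -> String? {
    if edgeTTLHours > serverTTLHours {
        return "Edge TTL exceeds Server TTL; tombstones would be synced back to Ditto Server after removal"
    }
    if edgeTTLHours < minimumSensibleHours {
        return "Edge TTL may be too short for tombstones to reach other peers"
    }
    return nil
}

// The default of 7 days passes; 45 days exceeds the 30-day Server TTL.
print(ttlProblem(edgeTTLHours: 7 * 24) ?? "ok")
print(ttlProblem(edgeTTLHours: 45 * 24) ?? "ok")
```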
By default we only check for expired tombstones once a day, but this parameter allows us to check less frequently. This could provide
small performance benefits on resource constrained devices.
If you want to have tombstones expire more slowly, use the TOMBSTONE_TTL_HOURS parameter.
When enabled, Ditto will attempt to schedule the tombstone reaping process to occur during the hour specified in the
REAPER_PREFERRED_HOUR parameter.
Setting the system parameter
Due to a known issue, changes to this parameter at runtime are not currently respected. The desired value must be set before starting Ditto.
This will be resolved in an upcoming release.
This can be done by setting a system environment variable with the same name as the system parameter, ENABLE_REAPER_PREFERRED_HOUR_SCHEDULING.
For assistance, reach out to Ditto’s Customer Support.
Because the reaping event can have performance impact for large databases, this setting, in use with REAPER_PREFERRED_HOUR,
can be used to schedule the reaping to occur during off-hours to minimize business impact.
This is disabled by default. When disabled, we default to running tombstone reaping shortly after startup, and then repeat
every DAYS_BETWEEN_REAPING days, regardless of the time of day.
Enabling this setting does not specify that tombstone reaping will be triggered at a particular point during that hour. It indicates that we will
attempt to reap tombstones at some point during that hour, assuming Ditto is running at that time.
If this parameter is enabled, DAYS_BETWEEN_REAPING will continue to be used to determine tombstone reaping cycles.
If this parameter is enabled and Ditto is not running during the hour specified in REAPER_PREFERRED_HOUR, the tombstone reaping process will not run that day.
If this happens for many days in a row, it could lead to performance issues due to increased storage space usage.
WASM-based platforms currently don’t support this setting. This includes the JavaScript Web and Flutter Web SDKs.
When ENABLE_REAPER_PREFERRED_HOUR_SCHEDULING is true, Ditto will try to schedule the tombstone reaping process to occur during the hour
specified in the REAPER_PREFERRED_HOUR parameter.
Setting the system parameter
Due to a known issue, changes to this parameter at runtime are not currently respected. The desired value must be set before starting Ditto.
This will be resolved in an upcoming release.
This can be done by setting a system environment variable with the same name as the system parameter, REAPER_PREFERRED_HOUR.
For assistance, reach out to Ditto’s Customer Support.
This setting is only active if ENABLE_REAPER_PREFERRED_HOUR_SCHEDULING is set to true.
Because the reaping event can have a performance impact on large databases, this setting, in use with ENABLE_REAPER_PREFERRED_HOUR_SCHEDULING,
can be used to schedule the reaping to occur during off-hours to minimize business impact.
This setting does not specify that tombstone reaping will be triggered at a particular point during that hour; only that we will attempt
to reap tombstones at some point during that hour, assuming Ditto is running.
If Ditto is not running during the specified hour, the tombstone reaping process will not run that day. If this happens for many days
in a row it could lead to performance issues due to increased storage space usage.
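To judge whether a preferred hour suits a device’s usage pattern, the rule above can be modelled as follows. This is a toy model, not Ditto’s scheduler: it assumes the preferred hour is a local hour in 0..<24, treats each day’s running time as a single half-open range of hours, and uses hypothetical helper names.

```swift
/// True if Ditto is running at some point during the preferred hour.
/// `runningHours` is a half-open range of local hours during which the app runs that day.
func reapingCanRun(preferredHour: Int, runningHours: Range<Int>) -> Bool {
    return runningHours.contains(preferredHour)
}

/// Counts the most recent consecutive days on which reaping would have been skipped.
/// `dailyRunningHours` is ordered oldest first.
func consecutiveMissedDays(preferredHour: Int, dailyRunningHours: [Range<Int>]) -> Int {
    var missed = 0
    for hours in dailyRunningHours.reversed() {
        if reapingCanRun(preferredHour: preferredHour, runningHours: hours) { break }
        missed += 1
    }
    return missed
}

// A device that only runs from 08:00 to 18:00 never reaps with a preferred hour of 3.
let officeHours: [Range<Int>] = Array(repeating: 8..<18, count: 5)
let missedAtThree = consecutiveMissedDays(preferredHour: 3, dailyRunningHours: officeHours)
let missedAtNoon = consecutiveMissedDays(preferredHour: 12, dailyRunningHours: officeHours)
print(missedAtThree, missedAtNoon)
```

A growing count of missed days is the situation the warning above describes: expired tombstones accumulate until a reaping pass finally runs.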
Although the document you evicted is removed from the local Ditto store, the document stored within remote Ditto stores persists.
To prevent evicted data from immediately reappearing on the screen, stop the relevant subscriptions before you call EVICT. If a subscription that matches the evicted documents remains active, connected peers notice that the documents are missing locally and sync them back, even if you have reset the data in your end-user environment.
The EVICT keyword in DQL immediately removes one or more specified documents from the local Ditto store, making it inaccessible by local queries.
All documents matching the EVICT condition will be evicted. The IDs of the evicted documents can be retrieved using the mutatedDocumentIDs method on the result.
The following example evicts all blue cars stored in the cars collection from the local Ditto store:
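// Evict every blue car from the local store, then print the IDs of the evicted documents.
let result = try await ditto.store.execute(query: "EVICT FROM cars WHERE color = 'blue'")
result.mutatedDocumentIDs().forEach { print($0) }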
Because evicting a document removes it only from the local device and not from the larger system, other peers, including Ditto Server, will still have the document. If
this is not managed properly, the document will return through a sync subscription.
For example, if you have an active subscription for fetching 'blue' cars and you subsequently evict a document with the ID '123456' that matches the replication
query, connected peers will notice that you are missing document '123456' that matches your subscription for 'blue' cars and send it back to you. To prevent this
from happening you need to ensure that any active sync subscriptions don’t contain documents you plan on evicting from the device.
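The effect described above can be reproduced with a toy model. This is not Ditto’s replication algorithm: the `sync` function below simply copies every remote document that matches the subscription predicate and is missing locally, which is enough to show why an evicted document comes back while a matching subscription is active. All names are hypothetical.

```swift
struct Car: Equatable {
    let id: String
    let color: String
}

/// Toy sync pass: copies every remote document that matches the subscription
/// and is missing from the local store. A nil subscription means none is active.
func sync(local: [Car], remote: [Car], subscription: ((Car) -> Bool)?) -> [Car] {
    guard let matches = subscription else { return local }
    let missing = remote.filter { doc in matches(doc) && !local.contains(doc) }
    return local + missing
}

let remote = [Car(id: "123456", color: "blue"), Car(id: "777", color: "red")]
let blueCars: (Car) -> Bool = { $0.color == "blue" }

// Initial sync, then evict document 123456 from the local store.
var local = sync(local: [], remote: remote, subscription: blueCars)
local.removeAll { $0.id == "123456" }

// Subscription still active: the evicted document is sent straight back.
let withSubscription = sync(local: local, remote: remote, subscription: blueCars)

// Subscription cancelled before the next sync pass: the document stays evicted.
let withoutSubscription = sync(local: local, remote: remote, subscription: nil)

print(withSubscription.map { $0.id }, withoutSubscription.map { $0.id })
```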
If you want to indicate that a batch of documents is irrelevant and, although they are to be retained, should not sync across peers, add the isSafeToEvict field to the document property tree. Then, use a method to alert clients to flag any documents they consider irrelevant.
To ensure that peers continue replicating documents that are considered relevant, incorporate isSafeToEvict == false into their sync subscription query.
This approach restricts replication to documents that have not been flagged as safe to evict. Once documents are flagged with isSafeToEvict set to true, peers clear them from their caches while normal transactional operations continue without interruption.
try await ditto.store.execute(query: "EVICT FROM cars WHERE isSafeToEvict = true")
If you need a data recovery option, opt for a soft-delete pattern instead of removing the data from the local Ditto store as EVICT does.
A soft-delete pattern is a way to flag data as inactive while retaining it for various requirements, such as archival evidence, reference integrity, prevention of potential data loss due to end-user error, and so on.
To monitor documents that are NOT archived, establish a live query where isArchived is set to false, and then construct your live query callback.
It’s likely that the isArchived field is set lazily (i.e. has no value until it is true), so you can use the coalesce() function to automatically return false if the value is unset.
The following code demonstrates searching for documents that are unarchived:
// Select every car that is not archived; an unset isArchived counts as false.
let result = try await ditto.store.execute(query: """
  SELECT * FROM cars
  WHERE coalesce(isArchived, false) = false
  """)
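The coalesce() behaviour can be mirrored in plain Swift to make the lazy-flag semantics explicit. The sketch below is an in-memory model with a hypothetical `Car` type and `isActive` helper, not Ditto code: a nil `isArchived` stands for an unset field, and archiving a document is a flag change rather than an eviction or deletion.

```swift
struct Car {
    let id: String
    var isArchived: Bool?   // nil models a field that has never been set
}

/// Mirrors `coalesce(isArchived, false) = false`: an unset flag counts as not archived.
func isActive(_ car: Car) -> Bool {
    return (car.isArchived ?? false) == false
}

var cars = [Car(id: "a", isArchived: nil),
            Car(id: "b", isArchived: true),
            Car(id: "c", isArchived: false)]
let activeBefore = cars.filter(isActive).map { $0.id }

// Soft-delete "c" by flagging it; the document itself is retained and can be restored.
cars[2].isArchived = true
let activeAfter = cars.filter(isActive).map { $0.id }

print(activeBefore, activeAfter)
```

Restoring a soft-deleted document is then just a matter of setting the flag back to false.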