Removing Documents
This article provides information on removing documents within Ditto.
When using DELETE, ensure that devices running the Ditto Edge SDK are on version 4.10.1 or later.
Support for Ditto Cloud Server is rolling out over the next few weeks.
For more information on how to use DELETE and manage data, reach out to Ditto’s Customer Support.
In Ditto, there are two key concepts for removing documents:
- Document Deletions using the DELETE keyword in DQL
- Document Evictions using the EVICT keyword in DQL
Deletions and evictions are designed to work together to give you flexibility in managing data within your Ditto application.
- Use DELETE when you want to permanently remove a document from a collection. Once a document is deleted with DELETE, it is considered gone from the user’s perspective and cannot be recovered.
- Use EVICT when you want to remove data only from the local device. Evictions are useful for scenarios where an Edge SDK only needs to store and sync part of the database. Evicting a document doesn’t delete it from the system; it just frees up local storage.
In general, for applications with both Ditto Servers and Edge SDKs, the typical approach is:
- Use DELETE to manage permanent deletion on Ditto Server.
- Use EVICT to manage data on a Ditto Edge SDK.
Here’s an example to show how these might work together:
Imagine a system where devices running the Ditto Edge SDK only need to keep documents for 3 days due to storage limits, while Ditto Server stores documents for 90 days to meet data retention requirements. In this case, the Ditto Edge SDK would evict any document older than 3 days, removing it from local storage without deleting it from the system. After 90 days, the user would use the Ditto Server HTTP API to permanently delete the documents.
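The retention scenario above can be sketched as two statements built from the same clock reading. This is a minimal sketch, not Ditto's prescribed approach: the `orders` collection, the `createdAt` field (assumed to hold an ISO-8601 UTC string), and the `:cutoff` argument name are all hypothetical, and the code only builds statement strings without calling any SDK or HTTP endpoint.

```python
from datetime import datetime, timedelta, timezone

EDGE_RETENTION = timedelta(days=3)     # Edge SDK devices keep documents for 3 days
SERVER_RETENTION = timedelta(days=90)  # Ditto Server keeps documents for 90 days


def retention_statements(now: datetime, collection: str = "orders") -> dict:
    """Build the EVICT (edge) and DELETE (server) statements for one cleanup run.

    Assumption: every document carries a hypothetical `createdAt` field that
    holds an ISO-8601 UTC string, so a string comparison orders by time.
    """
    edge_cutoff = (now - EDGE_RETENTION).isoformat()
    server_cutoff = (now - SERVER_RETENTION).isoformat()
    return {
        # Run on each Edge SDK device: frees local storage only.
        "edge": (f"EVICT FROM {collection} WHERE createdAt < :cutoff",
                 {"cutoff": edge_cutoff}),
        # Run against Ditto Server: permanent removal.
        "server": (f"DELETE FROM {collection} WHERE createdAt < :cutoff",
                   {"cutoff": server_cutoff}),
    }


if __name__ == "__main__":
    for target, (statement, args) in retention_statements(
            datetime.now(timezone.utc)).items():
        print(target, statement, args)
```

Keeping the two cutoffs in one function makes it harder for the edge and server policies to drift apart.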
Deleting Data
Auto-eviction of expired tombstones is enabled on SDK version 4.10.1 and later.
The DELETE keyword in DQL permanently removes one or more specified documents from the Ditto system. Once deleted, these documents are non-recoverable.
Example DELETE
Deleting Multiple Documents in a Collection
All documents matching the DELETE condition will be permanently deleted. The IDs of the deleted documents can be referenced using the mutatedDocumentIDs method on the result.
The following example permanently deletes all blue cars stored in the cars collection:
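A minimal sketch of such a statement, assuming the cars documents store their color in a field named `color` (the field name and the `:color` argument name are assumptions, not taken from this article). The helper only builds the statement and its arguments; passing them to your SDK or the HTTP API is left out.

```python
def delete_cars_by_color(color: str) -> tuple:
    """Return a parameterized DQL DELETE statement and its arguments."""
    # `color` is an assumed field name on documents in the `cars` collection.
    statement = "DELETE FROM cars WHERE color = :color"
    return statement, {"color": color}


statement, args = delete_cars_by_color("blue")
print(statement, args)
```

Passing the color as an argument rather than splicing it into the string avoids quoting mistakes when the value comes from user input.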
Design Considerations When Using DELETE
- DELETE permanently removes the selected documents from the system.
- Document tombstones are an internal system concept and are not accessible to the user.
- Document tombstones are only shared with devices that have seen the document before its deletion event. Peers will not receive tombstones for documents that they have never known about.
- Document tombstones consist of the document ID and the fields of the document at the time of deletion. All values are removed.
- Tombstone reaping and removal can cause performance issues for large data sets. Customers with large numbers of deleted documents should specify a reaper_preferred_hour and enable enable_reaper_preferred_hour_scheduling to ensure minimal business impact.
- If Ditto is not running on an Edge SDK device during the reaper_preferred_hour, expired tombstones will not be evicted.
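The scheduling advice in the last two items might be applied with statements like the ones below. This is a sketch under stated assumptions: the `ALTER SYSTEM SET name = value` form and the 0-23 hour range are assumptions, and only the two property names come from this article. Check the exact syntax and accepted values against the DQL reference before use.

```python
def reaper_schedule_statements(preferred_hour: int) -> list:
    """Build ALTER SYSTEM statements that pin tombstone reaping to one hour."""
    # Assumption: the preferred hour is an integer on a 24-hour clock (0-23).
    if not 0 <= preferred_hour <= 23:
        raise ValueError(f"preferred_hour must be 0-23, got {preferred_hour}")
    return [
        # Assumed statement form; property names are taken from the article.
        "ALTER SYSTEM SET enable_reaper_preferred_hour_scheduling = true",
        f"ALTER SYSTEM SET reaper_preferred_hour = {preferred_hour}",
    ]


for statement in reaper_schedule_statements(2):
    print(statement)
```

Pick an hour when the device is reliably running, since a device that is off during that hour skips the reaping run.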
Deleting Data Through the Ditto Server
Auto‑eviction for Ditto Cloud Server is rolling out over the next few weeks. Contact Ditto’s Customer Support to schedule an upgrade.
Deleting data on the cloud server can be performed by executing a DQL DELETE query through the HTTP API.
For information on how to execute a DQL query through the HTTP API, see HTTP POST API > Execute a DQL query.
Considerations when Deleting through Ditto Server
- Every DELETE statement performed on the Cloud/Server through the HTTP API is executed immediately as a single atomic operation. Removing a large number of documents (approximately 50,000 or more) at once risks degrading performance across the larger system, including slow sync times for all connected Edge SDK devices. To minimize the potential impact, delete documents in batches of 30,000 or fewer. This can be done using the LIMIT keyword.
- Deleted documents will be permanently removed from the Cloud/Server after 30 days. To configure this, contact Ditto’s Customer Support.
- Tombstone reaping (removal) is run every hour on the Ditto Cloud/Server. To configure this, contact Ditto’s Customer Support.
For more information on how to best design a data removal policy, contact Ditto’s Customer Support.
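The batching recommendation above can be sketched as a loop that repeats a LIMIT-bounded DELETE until a batch comes back short. This is an illustrative sketch: `execute` is a hypothetical callable standing in for whatever sends the statement to the HTTP API and returns the number of deleted documents, and `FakeServer` exists only so the example runs without a server. The `color` field is likewise assumed.

```python
from typing import Callable

BATCH_SIZE = 30_000  # batch ceiling recommended in this article


def delete_in_batches(execute: Callable[[str, dict], int], where: str,
                      args: dict, collection: str = "cars",
                      batch_size: int = BATCH_SIZE) -> int:
    """Repeat a LIMIT-bounded DELETE until fewer than batch_size docs match."""
    statement = f"DELETE FROM {collection} WHERE {where} LIMIT {batch_size}"
    total = 0
    while True:
        deleted = execute(statement, args)  # hypothetical transport call
        total += deleted
        if deleted < batch_size:            # short batch: nothing left to delete
            return total


class FakeServer:
    """Stand-in for the server: tracks how many matching documents remain."""

    def __init__(self, matching: int):
        self.matching = matching
        self.calls = []

    def execute(self, statement: str, args: dict) -> int:
        self.calls.append(statement)
        deleted = min(self.matching, BATCH_SIZE)
        self.matching -= deleted
        return deleted


server = FakeServer(matching=70_000)
print(delete_in_batches(server.execute, "color = :color", {"color": "blue"}))
```

In a real deployment you would likely pause between batches so connected devices can keep up with sync.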
Advanced: Configuring Tombstone Removal
Auto-eviction of expired tombstones is enabled on SDK version 4.10.1 and later.
To ensure all peers in the system are aware of a deleted document, the Ditto system keeps a compressed version of the document, called a document tombstone. A document tombstone indicates that the document is deleted and is shared with other peers in the mesh so that they learn of the deletion.
When a document is deleted, all of the mutable document information is removed. The remaining document information includes the document’s ID (_id) and metadata about the document, including the time the document was deleted (deletion timestamp).
Document tombstones have a deletion timestamp. Once the time since deletion exceeds the configured time to live (TTL), the document tombstone is considered expired and will be removed in a process called tombstone reaping, in which the Ditto system scans for and removes expired tombstones. Tombstone reaping runs once daily by default.
By default, document tombstones are retained on Edge SDK devices for 7 days before being automatically removed. Keeping document tombstones around ensures that all peers are aware of the deleted document. Once the document tombstone is removed from the system, there will be no history of the document’s existence.
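The timing described above can be worked through with a small model. This is a simplified sketch of the arithmetic only, not Ditto's implementation: it assumes a 7-day TTL, a reaping run every 24 hours counted from a known first run, and that a tombstone is removed by the first run at or after its expiry.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_TTL = timedelta(days=7)          # default retention on Edge SDK devices
REAPING_INTERVAL = timedelta(days=1)     # reaping runs once daily by default


def expires_at(deleted_at: datetime, ttl: timedelta = DEFAULT_TTL) -> datetime:
    """Moment at which a tombstone's age reaches the TTL."""
    return deleted_at + ttl


def removal_run(deleted_at: datetime, first_reaping: datetime,
                ttl: timedelta = DEFAULT_TTL,
                interval: timedelta = REAPING_INTERVAL) -> datetime:
    """First reaping run at or after expiry (simplified model)."""
    expiry = expires_at(deleted_at, ttl)
    if expiry <= first_reaping:
        return first_reaping
    # Ceiling division on timedeltas: number of intervals needed to reach expiry.
    runs = -((first_reaping - expiry) // interval)
    return first_reaping + runs * interval


first_run = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
deleted = datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)
print(expires_at(deleted), removal_run(deleted, first_run))
```

In this model a document deleted at 10:00 on January 1 expires at 10:00 on January 8 but lingers until the 02:00 run on January 9, so the effective retention can exceed the TTL by up to one reaping interval.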
Configuring Document Tombstone Retention on the Edge SDK
Ditto offers a set of 5 system properties for configuring tombstone retention on an Edge SDK. All properties can be set using the ALTER SYSTEM command on an Edge SDK device. For help configuring a retention policy that’s best for your use case, reach out to Ditto’s Customer Support.
Default configuration:
- Tombstone removal is disabled. To enable it, set the system parameter TOMBSTONE_TTL_ENABLED to true.
- Scanning for expired tombstones, also known as tombstone reaping, happens shortly after the instance is created.
- Tombstone reaping occurs once a day from the initial reaping.
- Tombstones are marked as expired 7 days from the time of their deletion event. An expired tombstone will be removed in the following reaping.
- Tombstone reaping
Evicting Data
The EVICT method, once invoked, immediately removes the specified document(s) from the local Ditto store, making them inaccessible to local queries.
For complete DQL syntax, see EVICT.
Although the document you evicted is removed from the local Ditto store, copies of the document stored in remote Ditto stores persist.
To prevent evicted data from briefly flickering back onto the screen, stop the relevant subscriptions before you call EVICT. Otherwise, the subscription remains active, and even if you reset the data in your end-user environment, the evicted data momentarily reappears.
The EVICT keyword in DQL immediately removes one or more specified documents from the local Ditto store, making them inaccessible to local queries.
Example EVICT
Evicting Multiple Documents in a Collection
All documents matching the EVICT condition will be evicted. The IDs of the evicted documents can be referenced using the mutatedDocumentIDs method on the result.
The following example evicts all blue cars stored in the cars collection from the local Ditto store:
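A minimal sketch of the statement, followed by a small in-memory model of its effect. The `color` field and `:color` argument name are assumptions, and the two dictionaries merely stand in for a local store and a remote peer's store; no Ditto API is called.

```python
# Assumed statement shape; `color` is a hypothetical field on cars documents.
EVICT_STATEMENT = "EVICT FROM cars WHERE color = :color"
EVICT_ARGS = {"color": "blue"}


def evict_locally(local_store: dict, color: str) -> list:
    """Model of EVICT: drop matching docs from the local store, return their IDs."""
    evicted = [doc_id for doc_id, doc in local_store.items()
               if doc.get("color") == color]
    for doc_id in evicted:
        del local_store[doc_id]
    return evicted


remote = {
    "a": {"_id": "a", "color": "blue"},
    "b": {"_id": "b", "color": "red"},
}
local = {doc_id: dict(doc) for doc_id, doc in remote.items()}  # local replica

evicted_ids = evict_locally(local, "blue")
print(evicted_ids, sorted(local), sorted(remote))
```

The remote dictionary is untouched, which mirrors the point above: eviction frees local storage while other stores keep their copies.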
Using Evict with Sync Subscriptions
Because evicting a document removes it only from the local device and not from the larger system, other peers, including Ditto Server, still have the document. If this is not managed properly, the document returns as the result of a sync subscription.
For example, if you have an active subscription for fetching 'blue' cars and you subsequently evict a document with the ID '123456' that matches the replication query, connected peers will notice that you are missing document '123456', which matches your subscription for 'blue' cars, and send it back to you. To prevent this from happening, ensure that no active sync subscription matches the documents you plan to evict from the device.
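The behavior described above can be reproduced with a toy model of sync. This is a conceptual sketch only: `sync` is a hypothetical stand-in for replication, subscriptions are modeled as plain predicates, and none of it reflects how Ditto implements sync internally.

```python
def sync(local: dict, remote: dict, subscriptions: list) -> list:
    """Toy replication: pull remote docs that match a subscription and are missing."""
    received = []
    for doc_id, doc in remote.items():
        if doc_id not in local and any(match(doc) for match in subscriptions):
            local[doc_id] = dict(doc)
            received.append(doc_id)
    return received


def is_blue(doc: dict) -> bool:
    return doc.get("color") == "blue"


remote = {"123456": {"_id": "123456", "color": "blue"}}
local = {"123456": dict(remote["123456"])}

# Case 1: evict while the 'blue' subscription is still active.
del local["123456"]
came_back = sync(local, remote, [is_blue])

# Case 2: cancel the subscription first, then evict.
del local["123456"]
stayed_gone = sync(local, remote, [])

print(came_back, stayed_gone)
```

The first case shows the evicted document returning on the next sync; the second shows it staying evicted once no subscription matches it.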
Coordinating Evictions
If you want to indicate that a batch of documents is irrelevant and, although the documents are to be retained, should not sync across peers, add the isSafeToEvict field to the document property tree. Then, use a method to alert clients to flag any documents they consider irrelevant.
To ensure that peers continue replicating documents that are considered relevant, incorporate isSafeToEvict == false into their sync subscription query.
This approach restricts replication to documents whose isSafeToEvict value is false. Once documents are flagged as true, peers clear those irrelevant documents from their caches, while normal transactional operations continue without interruption.
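A sketch of how the flag might split a collection into documents that keep replicating and documents that can be evicted. The two statements show one plausible DQL shape (the `cars` collection and the use of `=` in place of `==` are assumptions); the partition function models the same filter in plain Python.

```python
# Assumed statement shapes; verify operator and syntax against the DQL reference.
SUBSCRIPTION_QUERY = "SELECT * FROM cars WHERE isSafeToEvict = false"
EVICT_STATEMENT = "EVICT FROM cars WHERE isSafeToEvict = true"


def split_by_flag(docs: list) -> tuple:
    """Partition docs into (still replicated, safe to evict) by isSafeToEvict."""
    keep = [d for d in docs if d.get("isSafeToEvict") is False]
    evict = [d for d in docs if d.get("isSafeToEvict") is True]
    return keep, evict


docs = [
    {"_id": "a", "isSafeToEvict": False},
    {"_id": "b", "isSafeToEvict": True},
    {"_id": "c"},  # flag never written
]
keep, evict = split_by_flag(docs)
print([d["_id"] for d in keep], [d["_id"] for d in evict])
```

In this model a document that never had the flag written matches neither side, so writing isSafeToEvict as false when a document is created is the safer choice if you rely on this filter.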
Advanced Design Pattern: Soft-Delete Pattern
If you need a data recovery option, opt for a soft-delete pattern instead of removing the data from the local Ditto store with EVICT.
A soft-delete pattern is a way to flag data as inactive while retaining it for various requirements, such as archival evidence, reference integrity, prevention of potential data loss due to end-user error, and so on.
Adding a Soft-Delete Flag
To add a soft-delete pattern, set the isArchived field value to true:
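One way this could look as a DQL statement, sketched under assumptions: the `cars` collection, the `:id` argument name, and the example document ID are hypothetical, and the helper only builds the statement rather than executing it.

```python
def archive_statement(doc_id: str) -> tuple:
    """Return a parameterized UPDATE that flags one document as archived."""
    # Assumed collection name; the isArchived field comes from the article.
    statement = "UPDATE cars SET isArchived = true WHERE _id = :id"
    return statement, {"id": doc_id}


statement, args = archive_statement("123456")
print(statement, args)
```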
Querying Non-Archived Documents
To monitor documents that are NOT archived, establish a live query where isArchived is set to false, and then construct your live query callback.
It’s likely that the isArchived field is set lazily (i.e. has no value until it is true), so you can use the coalesce() function to automatically return false if the value is unset.
The following code demonstrates searching for documents that are unarchived:
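A sketch of such a query, with a plain-Python model of what the coalesce() fallback does. The `cars` collection and the exact statement shape are assumptions; the Python `coalesce` helper is a local illustration, not a Ditto function.

```python
# Assumed statement shape using the coalesce() function named in the article.
UNARCHIVED_QUERY = "SELECT * FROM cars WHERE coalesce(isArchived, false) = false"


def coalesce(*values):
    """Return the first value that is not None, or None if all are unset."""
    for value in values:
        if value is not None:
            return value
    return None


def unarchived(docs: list) -> list:
    """Keep docs whose isArchived is false or was never set."""
    return [d for d in docs if coalesce(d.get("isArchived"), False) is False]


docs = [
    {"_id": "a"},                       # flag never set: treated as unarchived
    {"_id": "b", "isArchived": True},
    {"_id": "c", "isArchived": False},
]
print([d["_id"] for d in unarchived(docs)])
```

Without the fallback, document "a" would be missed by a filter that only looks for an explicit false.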
Removing Soft-Delete Flag
To remove the flag and reactivate the document, set the isArchived field to false:
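A sketch of the restore step, plus a round trip in plain Python showing that archiving and restoring leave the rest of the document intact. The `cars` collection, the `:id` argument name, and the sample document are hypothetical.

```python
# Assumed statement shape for clearing the soft-delete flag on one document.
RESTORE_STATEMENT = "UPDATE cars SET isArchived = false WHERE _id = :id"


def set_archived(doc: dict, archived: bool) -> dict:
    """Return a copy of the document with the soft-delete flag set."""
    return {**doc, "isArchived": archived}


def is_active(doc: dict) -> bool:
    """A document is active unless it is explicitly archived."""
    return not doc.get("isArchived", False)


car = {"_id": "123456", "color": "blue"}
archived = set_archived(car, True)        # soft-deleted, data retained
restored = set_archived(archived, False)  # reactivated

print(is_active(car), is_active(archived), is_active(restored))
```

Because the document is never removed, restoring it is a single field update rather than a re-import.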