Skip to main content
This feature is available in SDK version 4.12.0 and later. See restrictions for current limitations.
From version 4.13.0, DQL supports union and intersect scans for queries with OR, IN, and AND operators. This allows the query optimizer to use multiple indexes simultaneously in a single query. For example, a query like WHERE status = ‘active’ OR priority = ‘high’ can leverage separate indexes on both status and priority fields, combining results through a union scan.
DQL indexing allows you to create simple indexes on the Small Peer to accelerate searches, support joins, and enhance the overall efficiency of data retrieval.

Performance and Query Selectivity

Index performance is inversely related to query selectivity—the percentage of documents returned by your query. Highly selective queries (returning a small percentage of documents) see the most dramatic improvements. For example, a query that returns only 10% of a collection can be approximately 90% faster with an appropriate index, as the index allows skipping the other 90% of documents entirely. Less selective queries benefit less from indexes since most documents still need to be examined.

Index Operations Overview

DQL supports the following index operations:
OperationDescription
CREATE INDEXCreates a simple on a specified field
DROP INDEXRemoves an existing index
USE INDEXForce a query to use a specific index
USE DIRECTIVESAdvanced control over index usage and query optimization
SELECT from system:indexesLists all indexes in the database
Index operations can only be performed through the execute SDK operation. Using any of these commands through registerObserver or registerSubscription will fail with an “invalid query not supported” response.

Creating Indexes

Simple Index

Create an index on a single field to improve query performance for that field:
DQL
-- Standard syntax
CREATE INDEX index_name ON collection_name (field1)

-- Alternative syntax
CREATE INDEX collection_name.index_name (field1)

-- With IF NOT EXISTS clause (prevents errors on duplicate index names)
CREATE INDEX IF NOT EXISTS index_name ON collection_name (field1)
For example, create an index on the color field in the cars collection:
DQL
CREATE INDEX color_idx ON cars (color)
-- or
CREATE INDEX cars.color_idx (color)
-- or with IF NOT EXISTS to allow repeated execution
CREATE INDEX IF NOT EXISTS color_idx ON cars (color)
The IF NOT EXISTS clause allows the statement to execute successfully even if an index with the same name already exists on the collection. This is useful for initialization scripts that may run multiple times.
The IF NOT EXISTS clause only checks if an index with the same name exists - it does not verify the field definition. If an index with the same name already exists (even on a different field), the CREATE INDEX IF NOT EXISTS will succeed without creating a new index or modifying the existing one. To change an index definition, you must explicitly DROP the existing index first before creating the new one.

Index Creation Examples

The CREATE INDEX operation returns an empty result set (no items) upon successful completion. Successful execution without errors indicates that the index was created successfully - if there are any issues (such as duplicate index names or invalid syntax), an error will be thrown. To verify that your index was created successfully, you can query the system:indexes collection.
// Create a simple index
let result = await ditto.store.execute(
  query: "CREATE INDEX color_idx ON cars (color)")
// result.items will be empty - this is expected

// Create with IF NOT EXISTS - safe for repeated execution
let result2 = await ditto.store.execute(
  query: "CREATE INDEX IF NOT EXISTS color_idx ON cars (color)")
// No error even if index already exists

// Verify the index was created
let indexes = await ditto.store.execute(
  query: "SELECT * FROM system:indexes WHERE _id = :indexName",
  arguments: ["indexName": "cars.color_idx"])
// indexes.items will contain the index details

Nested Fields

You can create indexes on nested fields by using dot notation. You can create indexes on registers nested within a map, as long as the value at the end of the path is a scalar.
DQL
CREATE INDEX nested_idx ON cars (properties.engine.type)

LIKE

Indexes can also accelerate LIKE queries when searching for prefix patterns:
DQL
-- Create an index on the name field
CREATE INDEX name_idx ON users (name)

-- This query can use the name_idx for efficient prefix matching
SELECT * FROM users WHERE name LIKE 'John%'
Indexes are particularly effective for LIKE patterns that start with a fixed prefix (e.g., ‘prefix%’). Patterns starting with wildcards (e.g., ‘%suffix’) cannot leverage indexes effectively.

Automatic Index Selection

By default, DQL automatically chooses the most appropriate index for your queries based on the WHERE clause conditions and ORDER BY expressions:
DQL
SELECT * FROM cars WHERE color = 'blue'
-- Automatically uses color_idx if it exists
When a query contains both WHERE and ORDER BY clauses, DQL considers both when selecting the optimal index. An index that satisfies both the filter condition and sort order will be preferred over one that only satisfies the filter.

Operating on Indexes

DQL supports additional options when creating indexes:

Viewing Indexes

You can discover all indexes in your database by querying the special system:indexes collection:
DQL
SELECT * FROM system:indexes
Each document in the system:indexes collection follows this schema:
{
  "_id": "index_name",
  "collection": "collection_name",
  "fields": ["field1", "field2"]
}
Example
{
  "_id":"tasks.tasks_id",
  "collection":"tasks",
  "fields":["`_id`"]
}
// List all indexes
let result = await ditto.store.execute(
  query: "SELECT * FROM system:indexes")

// List indexes for a specific collection
let result = await ditto.store.execute(
  query: "SELECT * FROM system:indexes WHERE collection = :collectionName",
  arguments: ["collectionName": "cars"])

Dropping Indexes

Remove an index that is no longer needed:
DQL
-- Standard syntax
DROP INDEX index_name ON collection_name

-- Alternative syntax
DROP INDEX collection_name.index_name

-- With IF EXISTS clause (prevents errors if index doesn't exist)
DROP INDEX IF EXISTS index_name ON collection_name
For example, to drop the color_idx index from the cars collection:
DQL
DROP INDEX color_idx ON cars
-- or
DROP INDEX cars.color_idx
-- or with IF EXISTS to allow repeated execution
DROP INDEX IF EXISTS color_idx ON cars
The IF EXISTS clause allows the statement to execute successfully even if the index doesn’t exist. This is useful for cleanup scripts that may run multiple times or when you’re unsure if an index exists.
Without the IF EXISTS clause, attempting to drop a non-existent index will raise an error.
// Standard syntax
await ditto.store.execute(query: "DROP INDEX color_idx ON cars")

// Alternative syntax
await ditto.store.execute(query: "DROP INDEX cars.color_idx")

// With IF EXISTS - safe for repeated execution
await ditto.store.execute(query: "DROP INDEX IF EXISTS color_idx ON cars")
// No error even if index doesn't exist

Controlling Index Usage

DQL provides several ways to control index selection in your queries.
Using index directives on a sync subscription will result in an invalid query error.

USE INDEX Syntax

The USE INDEX clause is the simple way to force a query to use a specific index. It appears after the collection name in SELECT statements:
DQL
SELECT * FROM collection_name USE INDEX 'index_name' WHERE ...
DQL
-- Force use of a specific index
SELECT * FROM cars USE INDEX 'color_idx' WHERE color = 'blue'
This syntax is straightforward and recommended for most use cases where you need to specify which index to use for a query.

When to Use USE INDEX

There are scenarios where you might want to force a specific index rather than letting DQL auto-select.
  1. Forcing a specific index when multiple could apply
DQL
-- You have indexes: customer_idx and total_idx
-- Force the customer index when you know it's more selective
SELECT * FROM orders USE INDEX 'customer_idx'
WHERE customer_id = 'abc123' AND total > 1000
-- Without forcing, DQL might choose total_idx if it estimates incorrectly
  1. Preferring sort order optimization
DQL
-- You have indexes: date_idx and status_idx
-- Force the date index when sort performance is critical
SELECT * FROM events USE INDEX 'date_idx'
WHERE status = 'active' ORDER BY date DESC LIMIT 10
  1. Working around index statistics
DQL
-- Force an index when you know the data distribution better than the optimizer
-- e.g., most cars are 'available' but you're querying a rare status
SELECT * FROM cars USE INDEX 'status_idx'
WHERE status = 'maintenance' -- Only 0.1% of records
  1. Testing and performance comparison
DQL
-- Compare performance between different indexes
SELECT * FROM products USE INDEX 'category_idx' WHERE category = 'electronics'
-- vs
SELECT * FROM products USE INDEX 'category_brand_idx' WHERE category = 'electronics'

Advanced: USE DIRECTIVES Syntax

For more advanced scenarios, DQL also supports the USE DIRECTIVES syntax which allows fine-grained control over query optimization using JSON directives:
DQL
SELECT * FROM collection_name USE DIRECTIVES '<json-directives>' WHERE ...
DQL
-- Force use of a specific index (equivalent to USE INDEX)
SELECT * FROM cars USE DIRECTIVES '{"#index":"color_idx"}'
WHERE color = 'blue'

-- Force use of multiple indexes for intersection (v4.13+)
SELECT * FROM movies USE DIRECTIVES '{"#index":["movies_title_idx", "movies_rated_idx"]}'
WHERE title LIKE 'Mind%' AND (rated = 'PG' OR rated = 'G')

-- Force collection scan (skip all indexes)
SELECT * FROM cars USE DIRECTIVES '{"#index":null}'
WHERE color = 'blue'

-- Let DQL auto-select from available indexes
SELECT * FROM cars USE DIRECTIVES '{"#index":[]}'
WHERE color = 'blue'

-- Prefer ordering optimization
SELECT * FROM cars USE DIRECTIVES '{"#prefer_order":true}'
WHERE year > 2020 ORDER BY price DESC
// Using USE INDEX syntax
let result = await ditto.store.execute(
  query: "SELECT * FROM cars USE INDEX 'color_idx' WHERE color = :color",
  arguments: ["color": "blue"])

// Using USE DIRECTIVES syntax (advanced)
let result2 = await ditto.store.execute(
  query: "SELECT * FROM cars USE DIRECTIVES '{\"#index\":\"color_idx\"}' WHERE color = :color",
  arguments: ["color": "blue"])

// Using comment directive (advanced)
let result3 = await ditto.store.execute(
  query: "/*+ {\"cars\":{\"#index\":\"color_idx\"}} */ SELECT * FROM cars WHERE color = :color",
  arguments: ["color": "blue"])

Comment Directives

Comment directives provide the same functionality using special comments:
DQL
/*+ <json-directives> */ SELECT ...
-- or
--+ <json-directives>
SELECT ...
The JSON structure for comment directives supports both collection-specific and global settings:
DQL
-- Collection-specific directive
/*+ {"cars":{"#index":"color_idx"}} */
SELECT * FROM cars WHERE color = 'blue'

-- Global directive affecting all collections
/*+ {"#prefer_order":true} */
SELECT * FROM cars WHERE year > 2020 ORDER BY price DESC

-- Multiple directives
/*+ {
  "cars": {
    "#index": "color_idx",
    "#prefer_order": true
  },
  "#disable_index_scan": false
} */
SELECT * FROM cars WHERE color = 'blue' ORDER BY year

Directive Reference

Collection-specific directives always override global directives. Unrecognized directives are silently ignored, but the JSON must be valid or a parser error will occur.
DirectiveTypeDefaultDescription
#indexstring, array, or nullauto-selectSpecifies which index to use
#prefer_orderbooleanfalsePrefer indexes that optimize ORDER BY over WHERE
#disable_index_scanbooleanfalseDisable all index scans (global only)
#max_scansint16The maximum number of scans to permit in a statement
#intersectsint4The maximum number of intersect scans to permit in a statement branch
The #index directive accepts different value types.
ValueBehavior
"index_name"Use the specified index if available
["index1", "index2", ...]Use specified indexes for intersection (v4.13+)
null or ""Force collection scan (no indexes)
[]Auto-select from available indexes
Starting in v4.13, you can pass an array of index names to force the use of multiple indexes simultaneously through intersection scans. The indexes will be applied in the order specified for AND conditions. For example: {"#index":["movies_title_idx", "movies_rated_idx", "movies_year_idx"]}.
Given multiple directives, the last instance of any directive applies. For example, the following will result in ix2 being used and not ix1.
SELECT ... FROM cars
USE DIRECTIVES '{"#index":"ix1"}'
WHERE ... /*+{"cars":{"#index":"ix2"}}*/
Translation of IN-list filters to grouped OR terms is limited by the lower of #max_inlist and the remaining number of scans below #max_scans at the time the list is processed. #max_inlist should be used to limit the number of scans used for processing of IN-list filters when it is preferable to apply other filters.

Ditto Instance Initialization

When initializing your Ditto instance, you may want to create indexes that are essential for your application’s performance. Since indexes are stored in the database and persist across application restarts, you need to handle cases where indexes might already exist from previous runs.

Initialization Patterns

There are three approaches for handling index creation during initialization:
  1. Use IF NOT EXISTS clause - The simplest and most efficient approach (recommended)
  2. Check before creating - Query system:indexes to see if the index exists
  3. Create and catch errors - Attempt to create the index and handle duplicate errors gracefully

Initialization Examples

// Approach 1: Use IF NOT EXISTS (recommended)
func initializeIndexes() async throws {
    let indexQueries = [
        "CREATE INDEX IF NOT EXISTS color_idx ON cars (color)",
        "CREATE INDEX IF NOT EXISTS name_idx ON users (name)",
        "CREATE INDEX IF NOT EXISTS date_idx ON orders (created_date)"
    ]

    for query in indexQueries {
        await ditto.store.execute(query: query)
        print("Index created or already exists")
    }
}

// Approach 2: Check if index exists before creating
func initializeIndexesWithCheck() async throws {
    let requiredIndexes = [
        ("cars", "color_idx", "color"),
        ("users", "name_idx", "name"),
        ("orders", "date_idx", "created_date")
    ]

    for (collection, indexName, field) in requiredIndexes {
        let existingIndex = await ditto.store.execute(
            query: "SELECT * FROM system:indexes WHERE collection = :collection AND _id = :indexId",
            arguments: ["collection": collection, "indexId": "\(collection).\(indexName)"])

        if existingIndex.items.isEmpty {
            await ditto.store.execute(
                query: "CREATE INDEX \(indexName) ON \(collection) (\(field))")
            print("Created index \(indexName) on \(collection)")
        } else {
            print("Index \(indexName) already exists on \(collection)")
        }
    }
}

// Approach 3: Create and handle errors
func initializeIndexesWithErrorHandling() async {
    let indexQueries = [
        "CREATE INDEX color_idx ON cars (color)",
        "CREATE INDEX name_idx ON users (name)",
        "CREATE INDEX date_idx ON orders (created_date)"
    ]

    for query in indexQueries {
        do {
            try await ditto.store.execute(query: query)
            print("Successfully created index")
        } catch {
            if error.localizedDescription.contains("already exists") {
                print("Index already exists, continuing...")
            } else {
                print("Failed to create index: \(error)")
                throw error
            }
        }
    }
}

Best Practices for Initialization

  1. Use IF NOT EXISTS for simplicity - The simplest and most efficient approach for most use cases
  2. Use the check-first approach for critical indexes - When you need detailed logging and control
  3. Batch index creation - Create all required indexes during initialization rather than on-demand
  4. Log index status - Always log whether indexes were created or already existed for debugging
  5. Handle errors gracefully - Don’t let index creation failures prevent your app from starting unless the indexes are absolutely critical
  6. Consider startup performance - The IF NOT EXISTS approach is fastest as it requires no additional queries

Error Handling

When working with indexes, you may encounter errors if an index with the same name already exists on a collection. This section covers how to handle such scenarios.

Duplicate Index Error

If you attempt to create an index with a name that already exists on the same collection, DQL will throw the following error:
Database error: an index with the given name already exists on the given collection
To handle this error, you have several options:
  1. Use IF NOT EXISTS clause - Prevents the error entirely (recommended)
  2. Check if the index exists first by querying system:indexes
  3. Drop the existing index before creating a new one (use DROP INDEX IF EXISTS for safety)
  4. Use a different index name
  5. Catch and handle the error in your application code

Error Handling Examples

// Approach 1: Use IF NOT EXISTS (recommended)
await ditto.store.execute(query: "CREATE INDEX IF NOT EXISTS color_idx ON cars (color)")
// No error handling needed - succeeds whether index exists or not

// Approach 2: Check if index exists before creating
let existingIndexes = await ditto.store.execute(
  query: "SELECT * FROM system:indexes WHERE collection = :collection AND _id = :indexName",
  arguments: ["collection": "cars", "indexName": "cars.color_idx"])

if existingIndexes.items.isEmpty {
  // Safe to create the index
  await ditto.store.execute(query: "CREATE INDEX color_idx ON cars (color)")
}

// Approach 3: Handle the error
do {
  try await ditto.store.execute(query: "CREATE INDEX color_idx ON cars (color)")
} catch {
  if error.localizedDescription.contains("already exists") {
    print("Index already exists, skipping creation")
  } else {
    throw error
  }
}

Best Practices for Error Handling

  1. Use idempotent index creation: Check if an index exists before attempting to create it
  2. Implement retry logic: For transient errors, implement appropriate retry mechanisms
  3. Log errors appropriately: Distinguish between expected errors (like duplicate indexes) and unexpected ones
  4. Consider using unique naming conventions: Include timestamps or version numbers in index names when appropriate

How it works

General Behavior

  • When an index is added to a collection, that collection must be scanned and indexed, which may take time for large collections
  • Index directives only suggest which index to consider. DQL still requires a valid reason (WHERE or ORDER BY clause) to actually use the index. An index won’t be used if the query doesn’t benefit from it.

Persistence

  • SDK Indexes are retained when Ditto shuts down
  • Index names are unique to a collection - creating an index with an existing name in the same collection will raise an error
  • To replace an existing index, you must first drop it, then recreate it

Missing Fields

  • SDK - Documents that don’t contain the indexed field will be included in index-optimized query results. (i.e. INCLUDE MISSING is implied).
  • Ditto Server HTTP API - By default, documents that don’t contain the indexed field will not be included in index-optimized query results. (i.e. EXCLUDE MISSING)

Supported APIs

API/FeatureIndex Support
execute SDK API✅ Supported
registerObserver SDK API✅ Supported
registerSubscription SDK API❌ Not Supported
Ditto Server HTTP API❌ Not Supported
SDKs with in-memory storage❌ Not Supported

Best Practices

  1. Create indexes on frequently queried fields: Focus on fields that appear often in your WHERE clauses
  2. Monitor index usage: Use the system:indexes collection to track your indexes
  3. Consider index creation cost: Adding indexes to large collections requires scanning and indexing time
  4. Clean up unused indexes: Drop indexes that are no longer needed to save storage space and improve write performance

Roadmap

DQL indexing focuses on core functionality for optimal performance. The following features are not currently supported:
  • Composite indexes on multiple fields
  • Partial indexes (e.g., CREATE INDEX ... 'WHERE status = 'active')
  • Functional indexes (e.g., UPPER(field1))
  • Tombstone indexing
  • Custom indexing on the Ditto Server
  • Array & Object support
Future releases may expand indexing capabilities based on user feedback and performance requirements.

Restrictions

When an unsupported operation is detected, the query will be serviced by a collection scan.
All predicates are applied as a filter on the documents matched by an index scan. This re-application of the index predicate along with initial application of any other predicates ensures the correct results.
In addition to other restrictions, these restrictions apply to version 4.12 only. Starting in v4.13, union and intersect scans allow using multiple indexes simultaneously:
  1. v4.12 only: At most only one index scan will be used in any given query. This means that:
    1. Logical OR is not supported
      1. Any operation that resolves to a logical OR is not supported
        1. e.g. IN (a,b,c) which logically resolves to (a OR b OR c)
        2. e.g. Inequality, e.g. a != 0 which resolves to (a < 0 OR a > 0)
      2. This includes when a branch is always false and would never contribute any documents to the result
        1. e.g a = 1 OR false
    2. Only one non-overlapping logical AND predicate with an appropriate supporting index is applied as an index scan
      1. Overlapping predicates are combined when possible, e.g. a > 0 AND a < 10, to form a single closed range and are thus considered a single predicate when it comes to index selection
      2. As only the overlapping range is used, something like (e.g.) a > 0 AND a = 1 is combined to be just a = 1
  2. The following filters are not supported by index scans:
    1. NOT (logical negation)
      1. This is true even when the logical result would be a supported operation
        1. e.g. Even though NOT a != 0 logically is a = 0, it won’t be supported by an index scan
  1. The following filters are not supported by index scans:
    1. Any non-trivial predicates like functional and case-insensitive operations
      1. e.g.
        1. LOWER(name) = 'xxx' (with an index on (name))
        2. ILIKE 'abc%'
      2. Non-trivial value expressions that can be statically evaluated to a non-composite type (see below) are used
        1. e.g. a = :param % 2 as :param % 2 can be evaluated directly
    2. Composite value filters, e.g. (with an index on (a))
      1. a = [1,2,3]
      2. a = {'b':'c'}
    3. Filtering on a sub-field, e.g. (with only an index on (a))
      1. a.b = 1 will not use an index scan; you must index the full path you wish to filter on
    4. Filtering on an element when an indexed field is an ARRAY type
      1. e.g. a[2] = 's' with an index on (a) will use an index scan but all documents will be retrieved by it (i.e. no advantage over a collection scan)
    5. Filtering on a non-deterministic value
      1. e.g. a > clock() (clock() is non-deterministic)
  2. The choice of which predicate to use and therefore which index is selected is rule based
    1. Predicates are preferred in this order:
      1. Equality / IS NULL / IS MISSING
      2. Closed ranges (e.g. a > 0 AND a < 10) with no preference for inclusive or exclusive boundaries
        1. LIKE (case sensitive) with a constant prefix can be supported by a closed range based on the prefix
      3. Less than (or less than or equal to)
      4. Greater than (or greater than or equal to)
      5. The order of the available index list (which is alphabetically sorted on index name)
    2. If an index can offer ordering along with filtering it is preferred over one that provides only filtering
      1. e.g. ix1 on (b) is preferred over ix0 on (a) (or an intersection of ix1 and ix0) for a = 1 AND b > 1 ORDER BY b
      2. Statement ordering direction (ASC/DESC) must match the index direction
        1. The index key must be the leading sorting term
      3. Partial ordering will not override preferred filters
        1. e.g. an intersection of ix0 on (a) and ix1 on (b) is preferred for a = 1 AND b > 1 ORDER BY b,a
    As the planning is rule based there is no consideration for actual selectivity of different filters. If a particular filter is known to be more selective, directives can be used to direct planning.
    Use explain to show the chosen access plan.
  3. There is a limit to the number of index scans a statement will use.
    1. If possible the planner will apply all applicable index scans up to this limit and remaining predicates are applied as post-fetch filtering.
    2. All branches of a statement (i.e. OR-ed terms) must be serviced by an index scan; if any aren’t then a collection scan is necessary and therefore no index scans will be used.
    3. The total number of scans permitted is controlled by the #max_scans directive.
  4. There is a limit to the number of intersections (AND-ed terms) applied in any branch.
    1. Terms not applied with an index scan are applied as post-fetch filters.
    2. The limit is controlled by the #intersects directive.