Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.ditto.live/llms.txt

Use this file to discover all available pages before exploring further.

DQL now supports projections and aggregates in addition to SELECT * operations. You can select specific fields, perform calculations, and use aggregate functions like COUNT, SUM, AVG, MIN, and MAX.

SELECT Statements

The following syntax outlines the basic structure and optional clauses you can use within your SELECT statements:
DQL
SELECT [DISTINCT] projection
FROM your_collection_name
[WHERE condition]
[GROUP BY groupby_expression_1, groupby_expression_2, ...]
[HAVING condition]
[ORDER BY orderby_expression_1, orderby_expression_2, ... [ASC|DESC]]
[LIMIT limit_value]
[OFFSET number_of_documents_to_skip]
SELECT Syntax Diagram Where projection can be:
  • * - Returns all fields from the documents
  • field1, field2, ... - Returns specific fields
  • expression AS alias - Returns calculated values with custom names
  • Aggregate functions like COUNT(*), SUM(field), etc.
For instance, retrieve all documents in the cars collection WHERE the color property is set to the value ‘blue’ :
DQL
SELECT * FROM cars WHERE color = 'blue'

Clauses for Filtering

The following table provides an overview of the different clauses you can use to define specific conditions and calculations within your DQL SELECT statements to provide more granular control over your queries:
ClauseDescription
FROMThe required clause specifying the collection containing the documents for retrieval. (See FROM)
WHEREApplies filtering conditions to restrict the documents included in the result set. (See WHERE)
GROUP BYGroups documents based on one or more expressions for aggregate calculations. (See GROUP BY)
HAVINGFilters grouped results based on aggregate conditions. (See HAVING)
ORDER BYSpecifies the sorting order of the result set based on one or more expressions. (See ORDER BY)
LIMITRestricts the number of documents included in the result set. (See LIMIT)
OFFSETSkips a specific number of documents before returning the result set. (See OFFSET)

FROM

Required in each SELECT statement you write in DQL, the FROM element identifies the collection for document retrieval.
DQL
SELECT *
FROM your_collection_name
FROM Clause Syntax Diagram For example, a SELECT statement querying documents from the cars collection:
DQL
SELECT * FROM cars

USE IDS

The optional USE IDS clause allows you to efficiently retrieve specific documents by their IDs without performing a collection scan. This is more performant than using WHERE _id IN (...) filters. Syntax:
DQL
-- Inline IDs
SELECT * FROM collection_name USE IDS (id1, id2, id3)

-- Using a parameter array
SELECT * FROM collection_name USE IDS LIST :id_array
USE IDS Syntax Diagram Examples:
DQL
-- Retrieve specific documents by ID
SELECT * FROM cars USE IDS ('123', '456', '789')

-- Using a parameter array (recommended for dynamic queries)
SELECT * FROM cars USE IDS LIST :car_ids
let carIds = ["123", "456", "789"]
let results = await ditto.store.execute(
  query: "SELECT * FROM cars USE IDS LIST :car_ids",
  arguments: ["car_ids": carIds]
)
Performance Tip: Where possible, simple equality filters on _id (like WHERE _id = '123') are automatically optimized internally similar to USE IDS. However, for explicit control and guaranteed performance, USE IDS is recommended. See also the #auto_use_ids directive in Directives.

Collection Aliasing

You can assign an alias to a collection in the FROM clause to create shorter, more readable queries. The AS keyword is optional:
DQL
-- With AS keyword
SELECT * FROM cars AS c WHERE c.color = 'blue'

-- Without AS keyword
SELECT * FROM cars c WHERE c.color = 'blue'
Collection aliases are particularly useful in:
  • Complex queries: Shortening long collection names for readability
  • Qualified field references: Explicitly referencing fields from a specific collection
  • Query directives: Specifying collection-level directives (see Directives)
DQL
-- Using collection alias with qualified field references
SELECT c.make, c.model, c.year
FROM cars c
WHERE c.color = 'blue'
ORDER BY c.year DESC

-- Using collection alias with directives
/*+ {"c": {"#index": "ix_color"}} */
SELECT * FROM cars c WHERE c.color = 'blue'
Collection aliases follow the same identifier rules as projection aliases and can also use backticks for special characters or reserved words.

WHERE

The WHERE clause filters data based on either an expression or a set of conditions that narrow the result set returned to you:
DQL
SELECT *
FROM your_collection_name
WHERE [condition]
WHERE Clause Syntax Diagram For example, here is a basic SELECT statement querying documents from the cars collection based on a given address:
DQL
SELECT * FROM cars
WHERE location.address = '123 Main St, San Francisco, CA 98105'
To demonstrate a more complex query, here is a SELECT statement that queries using multiple expressions and logical operators to further refine the criteria for document retrieval:
DQL

SELECT * FROM cars
WHERE color = 'blue' AND features.trim = 'Standard' OR features.mileage > 10000

ORDER BY

With the ORDER BY clause, if you’d like, you can integrate calculations or expressions in your SELECT statement. Then sort the resulting documents to return in either ascending (ASC) or descending (DESC) alphabetical order:
DQL
SELECT *
FROM your_collection_name
ORDER BY expression_1, expression_2, ... [ASC|DESC]
ORDER BY Clause Syntax Diagram For example, here is a simple SELECT statement that uses the ORDER BY clause to query and sort documents from the cars collection in descending (DESC) alphabetical order based on the field value set for the color property:
DQL
SELECT *
FROM cars ORDER BY color DESC
In this syntax:
  • your_collection_name is the name of the collection from which you want to retrieve the data.
  • expression_1, expression_2, ... are the expressions evaluated to sort the result. Expressions are resolved in order.
  • [ASC|DESC] is an optional parameter that specifies the sort order. If omitted, the default sort order is ascending (ASC). To sort in descending order, you can specify DESC.
Example In this example, the result set from the query will be sorted in descending order based on the values in the field:
DQL
-- Sort by a given field_name
SELECT *
FROM your_collection_name
ORDER BY field_name DESC
For instance, here "blue" cars return first and other cars sort by the natural order in the collection:
DQL
SELECT *
FROM cars
ORDER BY color = 'blue'

Sort Order by Object Type

In DQL, the hierarchy for comparing and sorting objects varies based on the following criteria:
  • If ascending (ASC) order operations:
    • boolean
    • number
    • binary
    • string
    • array
    • object
    • null
    • missing
  • If descending (DESC) order operations, sorting order is reversed:
    • missing
    • null
    • object
    • array
    • string
    • binary
    • number
    • boolean
  • If evaluating values, true results are prioritized and ordered first followed by false results.

Expressing Sort Order

Unless explicitly defined as DESC in your query, Ditto defaults to sorting in ascending (ASC). So, if you want to sort in ascending order, you do not have to express that in your query.

LIMIT

The LIMIT clause is used to restrict the number of documents returned by a query, allowing you to specify a maximum limit on the number of documents to be included in the result set:
DQL
SELECT *
FROM your_collection_name
LIMIT limit_value
In this syntax:
  • your_collection_name is the name of the collection from which you want to retrieve the data.
  • limit_value is the maximum number of documents you want to include in the result set.
For example, only return the first 10 documents from the your_collection_name collection:
DQL
SELECT *
FROM your_collection_name
LIMIT 10

OFFSET

The OFFSET clause is used to specify the number of records to skip before starting to return documents from the query result:
DQL
SELECT *
FROM your_collection_name
OFFSET number_of_items_to_skip
In this syntax:
  • your_collection_name is the name of the collection from which you want to retrieve the data.
  • number_of_items_to_skip is the number of items before returning the result set.
Using OFFSET with LIMIT is a common way to utilize OFFSET; for example:
DQL
SELECT *
FROM your_collection_name
LIMIT 10
OFFSET 10

Projections

Projections allow you to specify exactly which fields or calculated values to return from your query, rather than returning entire documents. This reduces data transfer and processing overhead. Projection Syntax Diagram

Basic Field Selection

Select specific fields from documents:
DQL
SELECT make, model, year FROM cars

Excluding Fields

To exclude specific fields from a projection, use the MISSING keyword. This can be combined with * to select all fields except those explicitly excluded:
DQL
-- Given a document: { "_id": "abc", "make": "Toyota", "model": "Camry", "color": "blue", "vin": "1234" }

-- Exclude "vin" and "color", keeping everything else:
SELECT cars.*, MISSING vin, MISSING color FROM cars
-- Result: { "_id": "abc", "make": "Toyota", "model": "Camry" }

Aliasing

Use aliases to rename fields in your results:
DQL
SELECT make AS manufacturer, model AS car_model FROM cars
Default aliases are assigned if not specified:
  • Simple field references use the field name as the alias
  • Expressions get aliases like ($1), ($2), etc.
Aliases bind to the element immediately prior in the statement. SELECT COUNT(*) AS car_count FROM cars names the output field, but SELECT COUNT(*) FROM cars AS car_count aliases the cars collection — COUNT(*) falls back to its default alias.

Alias Constraints

Aliases must follow these rules:
  • Uniqueness: Each alias in a SELECT list must be unique. Duplicate aliases will raise an error.
  • Identifier Rules: Aliases must be valid identifiers following field naming conventions.
  • Special Characters: Use backticks (grave accents) to quote aliases containing special characters or reserved words.
DQL
-- Using backticks for special characters
SELECT price AS `final-price`, model AS `car-model` FROM cars

-- Using backticks for reserved words
SELECT status AS `order`, type AS `select` FROM orders

Alias Scope

Projection aliases defined in the SELECT list can be referenced by the ORDER BY clause to sort by an aliased expression:
DQL
SELECT make, price * 0.9 AS sale_price FROM cars ORDER BY sale_price DESC
GROUP BY and HAVING do not accept projection aliases. They must reference the underlying source expression or aggregate function directly: use GROUP BY color (the source field) rather than GROUP BY c (a projection alias), and HAVING COUNT(*) > 5 (the aggregate) rather than HAVING car_count > 5 (the alias).

Calculated Fields

Create new fields using expressions:
DQL
SELECT make, model, price * 0.9 AS discounted_price FROM cars

DISTINCT Results

Remove duplicate rows from your results:
DQL
SELECT DISTINCT color FROM cars
DISTINCT Performance Considerations:DISTINCT has performance implications you should understand:Memory Buffering:
  • All projections must be buffered in memory to enforce uniqueness
  • This increases memory requirements proportional to the number of unique results
  • Memory usage grows with the result set size
Redundant Usage:
  • When the _id field is included (e.g., SELECT DISTINCT *), DISTINCT is redundant
  • Each document already has a unique _id, making DISTINCT unnecessary
  • This adds overhead without benefit - avoid in these cases
Best Practices:
  • Only use DISTINCT when you truly need unique combinations of projected fields
  • Avoid DISTINCT when projecting complete documents (SELECT DISTINCT *)
  • Consider if your application logic can handle duplicates instead
  • For large result sets, evaluate if DISTINCT is worth the memory cost

Aggregate Functions

Aggregate functions perform calculations across multiple documents and return a single result. DQL supports the following aggregate functions:
For a quick reference table of all aggregate functions and their syntax, see Operator Expressions - Aggregate Functions.

COUNT

Count documents or non-null values:
DQL
-- Count all documents
SELECT COUNT(*) FROM cars

-- Count non-null values in a field
SELECT COUNT(color) FROM cars

-- Count distinct values
SELECT COUNT(DISTINCT color) FROM cars

SUM

Calculate the sum of numeric values:
DQL
SELECT SUM(price) AS total_value FROM cars

-- Sum distinct values only
SELECT SUM(DISTINCT price) FROM cars

AVG

Calculate the average of numeric values:
DQL
SELECT AVG(mileage) AS average_mileage FROM cars

MIN and MAX

Find minimum and maximum values:
DQL
SELECT MIN(year) AS oldest_year, MAX(year) AS newest_year FROM cars

MID

Calculate the midpoint between minimum and maximum values:
DQL
SELECT MID(price) AS midpoint_price FROM cars

-- With DISTINCT
SELECT MID(DISTINCT price) FROM cars
The MID function returns the midpoint value between MIN(expr) and MAX(expr). Non-numeric values are silently ignored.

MEDIAN

Calculate the median (middle) value:
DQL
SELECT MEDIAN(price) AS median_price FROM cars

-- With DISTINCT
SELECT MEDIAN(DISTINCT mileage) FROM cars
The MEDIAN function returns the middle value when all values are sorted. Non-numeric values are silently ignored.

Combining Aggregates

Use multiple aggregate functions in a single query:
DQL
SELECT 
  COUNT(*) AS total_cars,
  AVG(price) AS avg_price,
  MIN(price) AS lowest_price,
  MAX(price) AS highest_price
FROM cars

GROUP BY

Group results and calculate aggregates for each group:
DQL
SELECT color, COUNT(*) AS car_count, AVG(price) AS avg_price
FROM cars
GROUP BY color
GROUP BY Clause Syntax Diagram When using aggregates with GROUP BY, non-aggregate projections must be part of the GROUP BY clause:
DQL
-- This works: 'make' is in GROUP BY
SELECT make, COUNT(*) AS model_count 
FROM cars 
GROUP BY make

-- This would fail: 'model' is not in GROUP BY
-- SELECT make, model, COUNT(*) FROM cars GROUP BY make

HAVING

Filter grouped results based on aggregate values:
DQL
SELECT color, COUNT(*) AS car_count
FROM cars
GROUP BY color
HAVING COUNT(*) > 5
HAVING Clause Syntax Diagram
Aggregate functions form a “dam” in the execution pipeline - all documents must be processed before results can be returned. This differs from non-aggregate queries which can stream results.

Aggregate Function Behavior

Understanding how aggregate functions work internally helps optimize query performance: Memory Requirements:
  • Aggregates accumulate results per group, so memory usage depends on the number of groups
  • A small number of groups requires little memory regardless of document count
  • A large number of groups increases memory requirements proportionally
  • DISTINCT aggregates must buffer all distinct values in memory, which can significantly increase memory usage with many unique values
MISSING Values:
  • If no documents qualify for an aggregate in a group, the result is MISSING (absent from results)
  • To always include a result, use conditional functions: SUM(IFMISSING(field, 0))
  • MIN and MAX silently ignore MISSING values
  • COUNT does not count NULL, MISSING, or FALSE values unless explicitly handled
Type Handling:
  • SUM, AVG, MID, and MEDIAN silently ignore non-numeric values
  • MIN and MAX compare values using Ditto’s standard type ordering rules
  • COUNT(*) is equivalent to COUNT(true) with no additional overhead
Performance Tips:
  • Use COUNT(*) instead of COUNT(field) when counting all documents
  • Avoid DISTINCT with aggregates unless necessary due to memory overhead
  • Minimize the number of groups in GROUP BY for better memory efficiency
  • Use HAVING to filter groups rather than filtering after aggregation

Advanced Aggregate Examples

Handling MISSING Values:
DQL
-- Always return 0 instead of MISSING when no documents match
SELECT IFMISSING(SUM(price), 0) AS total_sales FROM orders WHERE status = 'completed'

-- Use default value for missing fields during aggregation
SELECT SUM(IFMISSING(price, 0)) AS total_with_defaults FROM products

-- Count all non-missing, non-null values regardless of truthiness
SELECT COUNT(NOT ISMISSINGORNULL(rating)) AS rated_items FROM reviews
Conditional Aggregation with CASE:
DQL
-- Sum order values by status within a single query
SELECT
  COUNT(*) AS total_orders,
  SUM(CASE WHEN status = 'completed' THEN final_price ELSE 0 END) AS completed_value,
  SUM(CASE WHEN status = 'pending' THEN price * discount ELSE 0 END) AS pending_value,
  SUM(CASE WHEN status = 'cancelled' THEN price ELSE 0 END) AS cancelled_value
FROM orders

-- Calculate weighted averages
SELECT
  category,
  AVG(CASE WHEN priority = 'high' THEN price * 1.5 ELSE price END) AS weighted_avg_price
FROM products
GROUP BY category

-- Sum with conditional multipliers
SELECT
  SUM(CASE
    WHEN quantity > 100 THEN price * quantity * 0.9
    WHEN quantity > 50 THEN price * quantity * 0.95
    ELSE price * quantity
  END) AS total_revenue
FROM order_items
Complex Business Logic:
DQL
-- Calculate profit margin by category
SELECT
  category,
  (SUM(sale_price) - SUM(cost_price)) / SUM(sale_price) * 100 AS profit_margin_pct,
  COUNT(*) AS items_sold
FROM sales
WHERE sale_date >= '2024-01-01'
GROUP BY category
HAVING (SUM(sale_price) - SUM(cost_price)) > 1000

-- Customer lifetime value with fallback for missing data
SELECT
  customer_id,
  COUNT(*) AS total_orders,
  SUM(IFMISSING(order_total, 0)) AS lifetime_value,
  AVG(IFMISSING(order_total, 0)) AS avg_order_value,
  MAX(order_date) AS last_order_date
FROM orders
GROUP BY customer_id
HAVING SUM(IFMISSING(order_total, 0)) > 500

-- Inventory health metrics
SELECT
  warehouse_id,
  COUNT(*) AS total_items,
  SUM(CASE WHEN stock_level < reorder_point THEN 1 ELSE 0 END) AS items_needing_reorder,
  SUM(CASE WHEN stock_level = 0 THEN 1 ELSE 0 END) AS out_of_stock_items,
  AVG(IFMISSING(stock_level, 0)) AS avg_stock_level
FROM inventory
GROUP BY warehouse_id
Filtering Within Aggregates:
DQL
-- Count only items matching specific conditions
SELECT
  department,
  COUNT(*) AS total_employees,
  COUNT(CASE WHEN salary > 100000 THEN 1 END) AS high_earners,
  AVG(CASE WHEN years_experience > 5 THEN salary END) AS avg_senior_salary
FROM employees
GROUP BY department

-- Calculate percentages within groups
SELECT
  region,
  COUNT(*) AS total_sales,
  SUM(CASE WHEN product_type = 'premium' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) AS premium_pct
FROM sales
GROUP BY region
Handling NULL and MISSING Gracefully:
DQL
-- Coalesce with multiple fallbacks
SELECT
  product_id,
  COUNT(*) AS review_count,
  AVG(COALESCE(rating, average_category_rating, 3.0)) AS effective_rating
FROM reviews
GROUP BY product_id

-- Conditional counting with null handling
SELECT
  status,
  COUNT(*) AS total,
  COUNT(completed_date) AS completed_with_date,
  COUNT(*) - COUNT(completed_date) AS missing_dates
FROM tasks
GROUP BY status

-- Sum with NULL protection
SELECT
  order_id,
  SUM(IFNULL(item_price * quantity, 0)) AS order_total,
  COUNT(*) AS item_count,
  SUM(IFNULL(discount_amount, 0)) AS total_discount
FROM order_items
GROUP BY order_id
Multi-level Aggregations:
DQL
-- Nested calculations with aggregates
SELECT
  category,
  COUNT(*) AS product_count,
  SUM(price * inventory_count) AS total_inventory_value,
  SUM(price * inventory_count) / COUNT(*) AS avg_value_per_product,
  MAX(price) - MIN(price) AS price_range
FROM products
GROUP BY category
HAVING COUNT(*) >= 5

-- Time-based aggregations with conditionals
SELECT
  DATE_FORMAT(created_at,'YYYY-MM-DD') AS order_date,
  COUNT(*) AS daily_orders,
  SUM(IFMISSING(total, 0)) AS daily_revenue,
  AVG(CASE WHEN total > 0 THEN total END) AS avg_order_value_excluding_zero
FROM orders
WHERE created_at >= DATE_CAST('2024-01-01','YYYY-MM-DD')
GROUP BY DATE_FORMAT(created_at,'YYYY-MM-DD')
ORDER BY order_date DESC
Working with Arrays and Complex Fields:
DQL
-- Aggregate with array transformations
SELECT
  seller_id,
  COUNT(*) AS total_products,
  AVG(array_length(tags)) AS avg_tags_per_product,
  SUM(CASE WHEN array_length(tags) > 5 THEN 1 ELSE 0 END) AS well_tagged_products
FROM products
GROUP BY seller_id

-- Conditional aggregation on nested fields
SELECT
  category,
  COUNT(*) AS total_items,
  AVG(IFMISSING(details.weight, 0)) AS avg_weight,
  SUM(CASE WHEN details.fragile = true THEN 1 ELSE 0 END) AS fragile_count
FROM inventory
GROUP BY category