WOQL Control Flow: Backtracking, Scope, and Performance

WOQL queries look declarative — you describe what you want and the engine figures out the rest. But underneath, the engine makes choices about how to find answers, and those choices follow specific rules. Understanding these rules is the difference between a query that streams results in milliseconds and one that materializes an entire dataset into memory before returning anything.

This page explains the execution model from the ground up: how solutions are generated, how variables move between inner and outer scopes, and how to structure queries so the engine can do its job efficiently.

Prerequisites: This page assumes you've completed the WOQL Tutorial and understand basic concepts like and, or, triple, and eq. If terms like "binding" and "variable" are unfamiliar, start there.

See also: WOQL Explanation | WOQL Tutorial | Range Queries with triple_slice | Time Handling | WOQL Class Reference

The Execution Model: Solutions and Backtracking

What is a solution?

A WOQL query produces a table of results. Each row is a solution — a complete assignment of values to variables that satisfies every constraint in the query. The engine's job is to find all such solutions.

Example: JavaScript

// This query has exactly 1 solution:
eq("v:x", 42)
// Result: x = 42

// This query has 0 solutions:
and(eq("v:x", 42), eq("v:x", 99))
// No value of x satisfies both constraints simultaneously

How backtracking works

The engine evaluates a query left to right. When it reaches a predicate that can produce multiple results — like triple, or, or member — it picks the first result and continues forward. If a later constraint fails, the engine backtracks to the most recent choice point and tries the next option.

Think of it as exploring a tree of possibilities:

Example: Text

and(
  triple(v.person, "city", v.city),     ← choice point: many people
  triple(v.person, "age", v.age),       ← for each person, look up age
  greater(v.age, 30)                    ← filter: does age > 30?
)

The engine:

1. Picks the first person/city triple.
2. Looks up that person's age.
3. Checks if age > 30. If yes, emits a solution. Either way, it then backtracks to the most recent choice point to look for further solutions.
4. Picks the next person/city triple and repeats from step 2.

This is backtracking — the engine systematically explores all possibilities by going forward when constraints succeed and going backward when they fail.
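The search order described above can be modelled in plain JavaScript with generators. This is only a toy sketch, not the engine's implementation: the facts array, the triple helper, and the sample data are all invented for illustration.

Example: JavaScript

```javascript
// Toy fact store (hypothetical data): [subject, predicate, object]
const facts = [
  ["alice", "city", "Oslo"], ["alice", "age", 34],
  ["bob", "city", "Rome"], ["bob", "age", 25],
  ["carol", "city", "Oslo"], ["carol", "age", 41],
];

// A generator acts as a choice point: each yield is one option to try.
function* triple(subject, predicate) {
  for (const [s, p, o] of facts) {
    if ((subject === undefined || s === subject) && p === predicate) yield [s, o];
  }
}

// Models: and(triple(person, "city", city), triple(person, "age", age), greater(age, 30))
function* olderThan(minAge) {
  for (const [person, city] of triple(undefined, "city")) { // pick a person/city
    for (const [, age] of triple(person, "age")) {           // look up the age
      if (age > minAge) yield { person, city, age };         // filter
    }
    // Leaving the inner loop is the backtrack to the outer choice point.
  }
}

const solutions = [...olderThan(30)];
```

Each nested loop corresponds to one choice point, which is why adding a generator to a query multiplies the work while adding a filter does not.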

Deterministic vs. non-deterministic predicates

Most WOQL predicates fall into one of two categories:

Category	Behavior	Examples
Non-deterministic (generators)	Can produce multiple solutions via backtracking	`triple`, `triple_slice`, `member`, `or`, `sequence`, `path`
Deterministic (filters/transforms)	Produce exactly one solution or fail	`eq`, `greater`, `less`, `not`, `typecast`, `length`, `sum`, `group_by`

Non-deterministic predicates are choice points: they are where the engine branches. Deterministic predicates either pass or block a solution, but never multiply the number of solutions. Wrappers such as limit and distinct belong to neither category: they pass on the solutions of their sub-query and only cap or suppress some of them.

Understanding this distinction is crucial for reasoning about how many rows a query will return and where the computational work happens.
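A rough way to estimate row counts is to multiply the branches of the generators and then subtract whatever the filters block. The sketch below uses made-up generators and a made-up filter in plain JavaScript; it illustrates the counting argument only and does not call WOQL.

Example: JavaScript

```javascript
// Hypothetical generators: each returns every option it can produce.
const colours = () => ["red", "green", "blue"]; // like member over 3 items
const sizes = () => ["S", "M"];                 // like member over 2 items

// A filter passes a candidate row or blocks it; it never adds rows.
const notGreen = (row) => row.colour !== "green";

const rows = [];
for (const colour of colours()) {
  for (const size of sizes()) {
    const row = { colour, size };
    if (notGreen(row)) rows.push(row);
  }
}
// 3 x 2 = 6 candidate rows; the filter blocks the 2 green ones.
```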

Variable Scope: Inner and Outer Queries

The default: all variables are globally scoped

In a flat WOQL query, all variables live in the same scope. A binding made anywhere is visible everywhere:

Example: JavaScript

and(
  eq("v:x", 42),
  eq("v:y", "v:x")   // v:x is already bound to 42, so v:y = 42
)

This is simple and intuitive for flat queries. It gets more interesting with sub-queries.

`select` — restricting which variables escape

select runs a sub-query but only exposes the listed variables to the outer query. Variables inside the sub-query that are not selected are invisible outside:

Example: JavaScript

and(
  select("v:name",
    and(
      triple("v:person", "name", "v:name"),
      triple("v:person", "age", "v:age")     // v:age is computed but not exposed
    )
  ),
  eq("v:age", "unknown")  // v:age is unbound here — select hid it
)

When to use select: When a sub-query binds helper variables that would pollute the outer scope or cause unintended unification. It is the primary tool for variable hygiene.
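One way to picture select is as a projection over the sub-query's solutions. The following is a toy model in plain JavaScript, with a hypothetical subQuery generator and sample data; it is not how the engine implements scoping.

Example: JavaScript

```javascript
// Hypothetical sub-query result: one binding object per solution.
function* subQuery() {
  yield { person: "p1", name: "Alice", age: 34 };
  yield { person: "p2", name: "Bob", age: 25 };
}

// Toy select: keep only the listed variables from each solution.
function* select(vars, solutions) {
  for (const binding of solutions) {
    const visible = {};
    for (const v of vars) visible[v] = binding[v];
    yield visible;
  }
}

// The outer query sees name only; person and age never leave the sub-query.
const outer = [...select(["name"], subQuery())];
```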

`group_by` — grouping with scoped aggregation

group_by runs a sub-query, groups the results by one or more variables, and collects the grouped values into lists. The sub-query's variables are scoped — only the grouping variables and the collected list are visible outside.

Example: JavaScript

// Count people per city
and(
  group_by(
    ["city"],                                          // group by these variables
    ["person"],                                        // collect these into lists
    "v:city_group",                                    // the collected list
    and(
      isa("v:person", "Person"),
      triple("v:person", "city", "v:city")
    )
  ),
  length("v:city_group", "v:count")
)

Key scoping behavior:

v:city is the grouping variable — it is bound in the outer scope (one value per group).
v:person is the template variable — its individual values are not visible outside, only the collected list is.
v:city_group is the result list — visible in the outer scope.
The inner query runs to exhaustion once; its solutions are then partitioned into groups.
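The scoping rules above can be approximated with a small grouping function. This is a simplified sketch in plain JavaScript: the inner array stands in for the sub-query's solutions, and groupBy is a hypothetical helper that handles a single grouping variable and a single template variable.

Example: JavaScript

```javascript
// Hypothetical inner solutions (what the sub-query would produce).
const inner = [
  { person: "p1", city: "Oslo" },
  { person: "p2", city: "Rome" },
  { person: "p3", city: "Oslo" },
];

// Toy group_by: consume every inner solution, then emit one row per group.
function groupBy(key, template, solutions) {
  const groups = new Map();
  for (const binding of solutions) {
    if (!groups.has(binding[key])) groups.set(binding[key], []);
    groups.get(binding[key]).push(binding[template]);
  }
  // Only the grouping value and the collected list leave this function.
  return [...groups].map(([value, list]) => ({ [key]: value, group: list }));
}

// Equivalent of following group_by with length(city_group, count).
const perCity = groupBy("city", "person", inner)
  .map((row) => ({ city: row.city, count: row.group.length }));
```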

Common pitfall: Using group_by with an empty grouping key just to collect all values into a list:

Example: JavaScript

// Verbose way to collect all ages into a list
group_by(
  [],
  ["age"],
  "v:all_ages",
  triple("v:person", "age", "v:age")
)

This works but is semantically misleading — you're not grouping by anything. The proposed Collect predicate (see below) makes this intent explicit.

`Collect` — gathering bindings into a list (proposed)

Collect is the inverse of Member. Where Member unpacks a list into individual bindings, Collect runs a sub-query to exhaustion and gathers all values of a template variable into a list.

Example: JavaScript

// Collect all ages into a list
WOQL.collect(
  WOQL.var("age"),
  WOQL.var("all_ages"),
  WOQL.triple(WOQL.var("person"), "age", WOQL.var("age"))
)
// Result: all_ages = [28, 35, 28, 42, 31]

Scoping behavior:

The query parameter is a self-contained sub-query. It runs independently, and it comes last in the argument list to support fluent queries.
The into variable receives the resulting list in the outer scope.
The template specifies what to collect from each solution.
Variables in the sub-query that are not in the template are existentially quantified: they exist only to help generate solutions.

The key insight: Collect must control the backtracking. You cannot collect values from a variable that is already backtracking in the outer scope; the sub-query that generates the values must be inside the Collect. The same constraint applies to group_by and to Prolog's findall predicate.

Example: JavaScript

// This does NOT collect outer backtracking into a list:
and(
  triple("v:person", "age", "v:age"),   // generates many bindings
  WOQL.collect("v:age", "v:list", ???)  // too late: v:age is already bound
)

// This DOES work: the generating query is inside Collect
WOQL.collect(
  WOQL.var("age"),
  WOQL.var("all_ages"),
  WOQL.triple(WOQL.var("person"), "age", WOQL.var("age"))
)
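The difference between the two shapes can be seen in a generator model. This is a plain JavaScript sketch with a hypothetical ages generator standing in for the triple sub-query; the collect helper here is a toy, not the proposed predicate.

Example: JavaScript

```javascript
// Hypothetical generator standing in for triple(person, "age", age).
function* ages() {
  yield* [28, 35, 28, 42, 31];
}

// Toy collect: it drives the generator itself and returns exactly one list.
function collect(makeSolutions) {
  return [...makeSolutions()];
}

// Outside: the outer loop owns the backtracking, so each iteration sees
// a single bound value and can never see the full list.
const perRow = [];
for (const age of ages()) {
  perRow.push([age]);
}

// Inside: the generator is handed to collect, which exhausts it.
const allAges = collect(ages);
```

The outer loop ends up with five one-element lists, while the call that owns the generator ends up with one five-element list.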

`member` — unpacking a list into bindings

Member is the inverse of Collect. Given a list, it generates one solution per element:

Example: JavaScript

and(
  eq("v:fruits", ["apple", "banana", "cherry"]),
  member("v:fruit", "v:fruits")
)
// Result: 3 rows — fruit = "apple", fruit = "banana", fruit = "cherry"

Member is non-deterministic — it creates a choice point with as many branches as there are list elements. Each branch produces one binding.

The Member/Collect symmetry:

Predicate	Direction	Behavior
`Member`	List → Bindings	Non-deterministic: generates one solution per element
`Collect`	Bindings → List	Deterministic: runs sub-query to exhaustion, produces one solution

`distinct` — deduplication within a scope

distinct eliminates duplicate values for a specified variable while preserving backtracking for everything else:

Example: JavaScript

distinct("v:city",
  triple("v:person", "city", "v:city")
)
// Returns each city once, even if multiple people live there

Scoping behavior: distinct observes the values of the specified variable as the inner query generates them, and suppresses solutions that would repeat a previously seen value. The inner query still runs normally — distinct is a filter on the output stream.
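The filter-on-a-stream behaviour can be sketched with a generator and a Set. This is an illustrative model in plain JavaScript with made-up city data, not the engine's implementation.

Example: JavaScript

```javascript
// Hypothetical inner solutions: several people share a city.
function* cities() {
  yield* ["Oslo", "Rome", "Oslo", "Lima", "Rome"];
}

// Toy distinct: pass a value through the first time it is seen, drop repeats.
function* distinct(values) {
  const seen = new Set(); // grows with the number of distinct values
  for (const value of values) {
    if (!seen.has(value)) {
      seen.add(value);
      yield value;
    }
  }
}

const unique = [...distinct(cities())];
```

Output can start before the inner query finishes, but the set of seen values has to be kept for the whole run.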

`not` — negation as failure

not succeeds when its inner query fails (produces zero solutions). It does not bind any variables:

Example: JavaScript

and(
  isa("v:person", "Person"),
  triple("v:person", "name", "v:name"),
  not(triple("v:person", "email", "v:email"))
)
// People without an email address

Critical scoping rule: Variables that appear only inside not are existentially quantified. not checks whether any solution exists — it does not expose which specific values matched. If you need to know which values were excluded, use a different pattern (e.g., opt + filter).
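Negation as failure amounts to asking whether the inner query has a first solution. The sketch below models that in plain JavaScript; the emails map, the emailOf generator, and the not helper are all hypothetical stand-ins.

Example: JavaScript

```javascript
// Hypothetical data: only some people have an email.
const emails = new Map([["alice", "alice@example.com"]]);
const people = ["alice", "bob", "carol"];

// Stand-in for triple(person, "email", email): zero or one solution here.
function* emailOf(person) {
  if (emails.has(person)) yield emails.get(person);
}

// Toy not: succeed only if the inner query yields nothing.
// It returns a boolean, so no inner binding can escape.
const not = (solutions) => solutions.next().done;

const withoutEmail = people.filter((person) => not(emailOf(person)));
```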

`opt` — optional matching

opt tries to match its inner query. If the inner query succeeds, its bindings flow to the outer scope. If it fails, the outer query continues with those variables unbound:

Example: JavaScript

and(
  triple("v:person", "name", "v:name"),
  opt(triple("v:person", "email", "v:email"))
)
// All people, with email if they have one

Scoping behavior: Unlike not, opt does expose inner variable bindings when the inner query succeeds. When it fails, those variables remain unbound (empty in the result).
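In the same generator model, opt passes inner bindings through when there are any and otherwise yields a single empty binding so the outer query can continue. This is a toy sketch in plain JavaScript with hypothetical data and helper names.

Example: JavaScript

```javascript
// Hypothetical data: only alice has an email.
const emails = new Map([["alice", "alice@example.com"]]);

function* emailOf(person) {
  if (emails.has(person)) yield { email: emails.get(person) };
}

// Toy opt: forward inner bindings, or yield one empty binding if there are none.
function* opt(solutions) {
  let matched = false;
  for (const binding of solutions) {
    matched = true;
    yield binding;
  }
  if (!matched) yield {};
}

const rows = [];
for (const person of ["alice", "bob"]) {
  for (const extra of opt(emailOf(person))) rows.push({ person, ...extra });
}
```

Both people produce a row; only the first row carries an email binding.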

Structuring Queries for Performance

Principle 1: Put the most selective constraint first

The engine evaluates left to right. Constraints that eliminate more possibilities early mean less backtracking overall:

Example: JavaScript

// Slow: scan all triples, then filter
and(
  triple("v:person", "name", "v:name"),
  triple("v:person", "city", "v:city"),
  triple("v:person", "age", "v:age"),
  greater("v:age", 60)
)

// Moving the age constraint first would help only if it could prune early, but triple
// cannot filter by value: it still enumerates every age triple before greater runs.

In practice, triple always scans all matching triples for a given predicate. The real optimization opportunity is choosing between triple and triple_slice.

Principle 2: Use `triple_slice` for range constraints

triple iterates every triple for a predicate. triple_slice binary-searches the sorted value dictionary and only touches values in the range:

Example: JavaScript

// O(n) — scans all timestamps, then filters
and(
  triple("v:doc", "timestamp", "v:time"),
  greater("v:time", literal("2025-01-01T00:00:00Z", "xsd:dateTime")),
  less("v:time", literal("2025-02-01T00:00:00Z", "xsd:dateTime"))
)

// O(log n) — binary search, only touches matching values
triple_slice("v:doc", "timestamp", "v:time",
  "2025-01-01T00:00:00Z", "2025-02-01T00:00:00Z")

Both return the same results. The performance difference grows with dataset size. For time-series data or large numeric ranges, triple_slice is almost always the right choice.
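The gap between the two approaches comes from binary search over sorted values. The sketch below shows the idea on a plain sorted array in JavaScript. It makes no claim about the engine's storage layout, and the half-open range (start included, end excluded) is an assumption of this example.

Example: JavaScript

```javascript
// Hypothetical sorted value dictionary (ISO timestamps sort lexicographically).
const sorted = [
  "2024-12-30T08:00:00Z", "2025-01-03T09:30:00Z", "2025-01-20T17:45:00Z",
  "2025-02-01T00:00:00Z", "2025-02-14T12:00:00Z",
];

// First index whose value is >= target (lower-bound binary search).
function lowerBound(values, target) {
  let lo = 0;
  let hi = values.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (values[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Two binary searches find the range; only values inside it are copied.
function rangeSlice(values, start, end) {
  return values.slice(lowerBound(values, start), lowerBound(values, end));
}

const start = "2025-01-01T00:00:00Z";
const end = "2025-02-01T00:00:00Z";
const january = rangeSlice(sorted, start, end);

// The scan-then-filter equivalent visits every value.
const scanned = sorted.filter((t) => t >= start && t < end);
```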

See the triple_slice guide for full details on binding modes, type inference, and worked examples.

Principle 3: Understand what streams and what materializes

WOQL predicates fall into two performance categories:

Streaming predicates yield one solution at a time, using constant memory regardless of how many total solutions exist:

triple / quad
triple_slice / quad_slice
member
sequence
path
and, or (composition)

Materializing predicates hold solutions from their sub-query in memory, and most must collect all of them before producing output. Memory usage scales with the number of inner solutions:

group_by: groups all solutions, then emits groups one at a time
Collect: collects all template values into a list in memory
distinct: emits results as it goes, but tracks every value seen so far to detect duplicates
order_by: must see all solutions before it can sort
aggregate / count / sum / length (on query results)

The practical consequence: If a sub-query inside group_by or Collect produces millions of solutions, those millions are held in memory. For very large result sets, consider whether you can narrow the sub-query with limit, triple_slice, or more selective constraints.
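The difference can be made visible by counting how many inner solutions are generated before the first two results are available. This is a plain JavaScript sketch with hypothetical source, limit, and orderBy helpers modelling the two categories.

Example: JavaScript

```javascript
let produced = 0;

// Hypothetical streaming source: yields values lazily and counts them.
function* source() {
  for (const n of [5, 3, 9, 1, 7]) {
    produced += 1;
    yield n;
  }
}

// Streaming: stop pulling as soon as enough solutions have been seen.
function* limit(n, solutions) {
  if (n <= 0) return;
  let taken = 0;
  for (const s of solutions) {
    yield s;
    taken += 1;
    if (taken >= n) return;
  }
}

// Materializing: must hold every solution before emitting the first one.
function* orderBy(solutions) {
  yield* [...solutions].sort((a, b) => a - b);
}

const firstTwo = [...limit(2, source())];
const producedWhenStreaming = produced; // only 2 values were generated

produced = 0;
const smallestTwo = [...limit(2, orderBy(source()))];
const producedWhenSorting = produced; // all 5 values were generated
```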

Principle 4: Streaming responses — results flow as they complete

When you execute a WOQL query via the API, TerminusDB can stream results back as they are found. The client receives the first solution as soon as it is computed — it does not have to wait for all solutions to be found.

This means:

Streaming queries (built from triple, and, or, etc.) start returning results immediately. Time-to-first-result is fast even for queries with millions of solutions.
Materializing queries (containing group_by, order_by, etc.) must finish their inner computation before the first result streams out. Time-to-first-result depends on the inner query's total execution time.

Designing for responsiveness: If your application can process results incrementally (e.g., displaying a table row by row), prefer streaming query shapes. If you need aggregation, do it as late as possible in the query so the streaming portion runs first.

Principle 5: Leverage bound variables for index access

triple uses different internal access patterns depending on which arguments are bound. For example:

Subject	Predicate	Object	Access pattern
bound	bound	unbound	Direct lookup — fast
bound	unbound	unbound	Iterate predicates for one subject
unbound	bound	unbound	Iterate subjects for one predicate
unbound	unbound	unbound	Full scan — slowest

The engine is fastest when it can use bound arguments to narrow the search. A common pattern is to bind the subject first (via a previous triple or isa), then look up specific properties:

Example: JavaScript

// Good: v:person is bound by isa, then used to look up properties directly
and(
  isa("v:person", "Person"),
  triple("v:person", "name", "v:name"),
  triple("v:person", "age", "v:age")
)

When you use and, the shared variable v:person is bound by the first predicate and used as an index key by the subsequent ones. This is the natural join pattern — and it is efficient because each subsequent triple does a direct lookup rather than a scan.
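The cost difference between a keyed lookup and a scan can be illustrated with a nested Map. This is a toy index in plain JavaScript. The structure, the sample data, and the visited counter are assumptions for illustration and say nothing about the engine's real indexes.

Example: JavaScript

```javascript
// Hypothetical index: subject -> predicate -> objects.
const index = new Map([
  ["p1", new Map([["name", ["Alice"]], ["age", [34]]])],
  ["p2", new Map([["name", ["Bob"]], ["age", [25]]])],
]);

let visited = 0; // counts index entries touched

// Subject and predicate bound: go straight to the matching objects.
function* lookup(subject, predicate) {
  const objects = index.get(subject)?.get(predicate) ?? [];
  for (const o of objects) {
    visited += 1;
    yield o;
  }
}

// Nothing bound: walk every entry in the index.
function* fullScan() {
  for (const [s, predicates] of index) {
    for (const [p, objects] of predicates) {
      for (const o of objects) {
        visited += 1;
        yield [s, p, o];
      }
    }
  }
}

const bobsAge = [...lookup("p2", "age")];
const visitedByLookup = visited;

visited = 0;
const everything = [...fullScan()];
const visitedByScan = visited;
```

With the subject bound, the lookup touches one entry; the full scan touches all four.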

Principle 6: Compose `Collect` with list operators

Rather than building custom aggregation logic, use Collect to bridge query results into lists, then apply existing list operators:

Example: JavaScript

// Min salary
and(
  WOQL.collect(
    WOQL.var("salary"),
    WOQL.var("salaries"),
    WOQL.triple(WOQL.var("emp"), "salary", WOQL.var("salary"))
  ),
  WOQL.range_min(WOQL.var("salaries"), WOQL.var("lowest"))
)

// Count of results
and(
  WOQL.collect(
    WOQL.var("doc"),
    WOQL.var("docs"),
    WOQL.isa(WOQL.var("doc"), "Person")
  ),
  WOQL.length(WOQL.var("docs"), WOQL.var("count"))
)

// Window of results
and(
  WOQL.collect(
    WOQL.var("x"),
    WOQL.var("all"),
    WOQL.sequence(WOQL.var("x"), 0, 100)
  ),
  WOQL.slice(WOQL.var("all"), WOQL.var("window"), 10, 20)
)

This pattern keeps the API surface small — one new predicate (Collect) unlocks all existing list operators (RangeMin, RangeMax, Slice, Length, Sum, ListToSet, SetDifference, SetIntersection, SetUnion) for use over query results.

Predicate Quick Reference: Control Flow Properties

Predicate	Deterministic?	Streams?	Scopes variables?	Description
`and`	—	Yes	No (shared scope)	All constraints must hold
`or`	No	Yes	No (shared scope)	Any branch can hold; one row per successful branch
`triple`	No	Yes	No	Pattern match on subject-predicate-object
`triple_slice`	No	Yes	No	Range-constrained triple with O(log n) access
`triple_slice_rev`	No	Yes	No	Range-constrained triple with O(log n) access, reversed
`triple_first`	No	Yes	No	Convenience range function: a limit(1) of the slice
`triple_last`	No	Yes	No	Convenience range function: a limit(1) of the reverse slice
`member`	No	Yes	No	Generates one binding per list element
`sequence`	No	Yes	No	Generates values in a range
`path`	No	Yes	No	Graph traversal with regex-like path expressions
`select`	—	Yes	Yes — hides non-selected variables	Filter which variables are exposed
`distinct`	—	Partial	No	Deduplicates; must track seen values
`not`	Yes	—	Yes — inner variables are existential	Succeeds when inner query fails
`opt`	—	Yes	Partial — bindings escape on success	Optional match; continues on failure
`group_by`	Yes	No (materializes)	Yes — template variables are scoped	Groups results, collects into lists
`collect`	Yes	No (materializes)	Yes — sub-query is self-contained	Gathers all template values into a list
`order_by`	—	No (materializes)	No	Sorts all solutions
`limit`	—	Yes	No	Caps the number of solutions
`eq`	Yes	—	No	Unifies two values
`greater` / `less`	Yes	—	No	Strict comparison filter

Patterns and Anti-Patterns

Pattern: Narrow early, expand late

Example: JavaScript

// Good: isa + triple_slice narrows to a small set, then join
and(
  isa("v:doc", "SensorReading"),
  triple_slice("v:doc", "timestamp", "v:time",
    "2025-01-01T00:00:00Z", "2025-01-02T00:00:00Z"),
  triple("v:doc", "temperature", "v:temp")
)

Anti-pattern: Collect everything, then filter

Example: JavaScript

// Bad: materializes all readings into memory, then filters client-side
WOQL.collect(
  WOQL.var("temp"),
  WOQL.var("all_temps"),
  WOQL.triple(WOQL.var("doc"), "temperature", WOQL.var("temp"))
)
// Then filter all_temps in application code

Instead, filter inside the sub-query:

Example: JavaScript

// Better: only collect what you need
WOQL.collect(
  WOQL.var("temp"),
  WOQL.var("daily_temps"),
  WOQL.and(
    WOQL.triple_slice(WOQL.var("doc"), "timestamp", WOQL.var("time"),
      "2025-01-01T00:00:00Z", "2025-01-02T00:00:00Z"),
    WOQL.triple(WOQL.var("doc"), "temperature", WOQL.var("temp"))
  )
)

Pattern: Use `Member` and `Collect` as inverses

Example: JavaScript

// Generate a list from a query
WOQL.collect(
  WOQL.var("city"),
  WOQL.var("all_cities"),
  WOQL.and(
    WOQL.isa(WOQL.var("person"), "Person"),
    WOQL.triple(WOQL.var("person"), "city", WOQL.var("city"))
  )
)

// Later: iterate over that list
WOQL.member(WOQL.var("city"), WOQL.var("all_cities"))

Pattern: Streaming joins for responsive UIs

When your UI can display results incrementally, prefer streaming query shapes:

Example: JavaScript

// Streams results immediately — first row arrives fast
and(
  isa("v:person", "Person"),
  triple("v:person", "name", "v:name"),
  triple("v:person", "age", "v:age")
)

Over materializing shapes:

Example: JavaScript

// Must compute all results before returning any
order_by("v:age", "asc",
  and(
    isa("v:person", "Person"),
    triple("v:person", "name", "v:name"),
    triple("v:person", "age", "v:age")
  )
)

Both return the same rows, but only the second guarantees their order. The first starts streaming immediately; the second waits until all results are sorted.

Summary

Concept	Key takeaway
Backtracking	The engine explores all possibilities by going forward on success and backward on failure
Choice points	Non-deterministic predicates (`triple`, `member`, `or`, `sequence`) create branches
Variable scope	Variables are global by default; `select`, `group_by`, `Collect`, and `not` create scoped boundaries
Streaming vs. materializing	`triple`, `member`, `and` stream; `group_by`, `Collect`, `order_by` materialize
Performance	Use `triple_slice` for ranges, bind variables early, filter inside sub-queries
Member/Collect symmetry	`Member` unpacks lists into bindings; `Collect` packs bindings into lists

Understanding these mechanics lets you write WOQL queries that are not just correct, but efficient — queries that leverage the engine's strengths rather than fighting against them.
