WOQL Architecture — AST Design & Security Model

This page explains why WOQL is designed the way it is — not how to write queries (see WOQL Basics for that), but the architectural decisions that make WOQL fundamentally different from string-based query languages. If you have ever dealt with SQL injection, parameterised queries, or ORM query-builder bugs, this context explains why those problems do not exist in WOQL.

The core idea: queries are data structures, not strings

Most query languages — SQL, Cypher, SPARQL — are defined as text grammars. You write a query as a string, send it to a server, and the server parses it. This creates a gap between the structure you intend and the text representation that reaches the server. That gap is where injection attacks, syntax errors, and composition bugs live.

WOQL takes a different approach. A WOQL query is an Abstract Syntax Tree (AST) — a structured JSON-LD object that the server receives directly. There is no parsing step, no string interpolation, and no text grammar. The client libraries provide builder functions that construct the AST programmatically:

Example: TypeScript

// This code builds a data structure — it does NOT build a string
import { triple, select } from "terminusdb"

const query = select(["v:Name", "v:Age"],
  triple("v:Person", "name", "v:Name"),
  triple("v:Person", "age", "v:Age")
)

The query variable now holds a JavaScript object — a tree of typed nodes — that serialises to JSON-LD for transmission:

Example: JSON

{
  "@type": "Select",
  "variables": ["Name", "Age"],
  "query": {
    "@type": "And",
    "and": [
      {
        "@type": "Triple",
        "subject": {"@type": "NodeValue", "variable": "Person"},
        "predicate": {"@type": "NodeValue", "node": "name"},
        "object": {"@type": "Value", "variable": "Name"}
      },
      {
        "@type": "Triple",
        "subject": {"@type": "NodeValue", "variable": "Person"},
        "predicate": {"@type": "NodeValue", "node": "age"},
        "object": {"@type": "Value", "variable": "Age"}
      }
    ]
  }
}

The server receives a typed tree, validates its structure against the WOQL schema, and executes it. No text parsing is involved at any point.

Security by construction

Why string-based queries are vulnerable

In SQL, a query is a string. If user input is concatenated into that string, the user controls the query structure:

Example: Python

# ❌ VULNERABLE: user input becomes part of query structure
username = request.params["user"]
query = f"SELECT * FROM users WHERE name = '{username}'"
# If username = "'; DROP TABLE users; --" → catastrophic

Parameterised queries (prepared statements) mitigate this by separating data from structure, but they require developer discipline. Every concatenation is a potential vulnerability. ORMs help but still generate strings internally.

Why WOQL cannot be injected

In WOQL, user input is always a value in a typed node — it can never become a structural element of the query:

Example: TypeScript

// ✅ SAFE: userInput is always a data value, never query structure
const userInput = request.params.user  // Even if malicious

const query = triple("v:Person", "name", userInput)

This produces:

Example: JSON

{
  "@type": "Triple",
  "subject": {"@type": "NodeValue", "variable": "Person"},
  "predicate": {"@type": "NodeValue", "node": "name"},
  "object": {"@type": "Value", "data": {"@type": "xsd:string", "@value": "'; DROP TABLE users; --"}}
}

The malicious string is trapped inside a data node. It cannot escape into the query structure because there is no string-to-structure boundary to cross. The AST enforces separation by construction — not by convention, not by developer discipline, but by the type system itself.

The security guarantee

Attack vector	String-based (SQL/Cypher/SPARQL)	WOQL (AST)
First-order injection	Possible (concatenation)	Impossible (no string boundary)
Second-order injection	Possible (stored strings reused in queries)	Impossible (values never become structure)
Parameter pollution	Possible in some ORMs	Impossible (typed nodes)
Operator injection	Possible (e.g., LIKE wildcards)	Impossible (operators are typed AST nodes)

This is not a claim that WOQL applications are invulnerable to all security issues. Authentication, authorisation, and business logic bugs exist regardless of query language. But the entire class of injection vulnerabilities — the most common database attack vector — is eliminated by architectural design.

Execution model

The execution boundary is clean and explicit:

Example: Text

┌──────────────────┐          JSON-LD           ┌──────────────────┐
│   Client (TS/Py) │  ──── AST over HTTP ────▶  │  TerminusDB      │
│                  │                             │  Server           │
│  Build AST       │                             │  Validate AST     │
│  Serialise JSON  │                             │  Execute query     │
│  Send request    │                             │  Return bindings   │
└──────────────────┘                             └──────────────────┘

Client side — The builder functions (triple(), select(), and()) construct an AST object in memory. No query execution happens on the client. The AST is serialised to JSON-LD and sent over HTTP.
Server side — TerminusDB receives the JSON-LD, validates it against the WOQL schema (checking that the AST is well-formed), then executes the query against the specified branch or commit. Results are returned as JSON bindings.
No local execution — The client never interprets or executes query logic. This means the server is the single point of query validation and access control. Client code cannot bypass server-side security checks.

What the server validates

When the server receives a WOQL AST, it checks:

Every node has a valid @type from the WOQL schema
Required fields are present (e.g., Triple must have subject, predicate, object)
Variable references are consistent
The target branch or commit exists and the user has read access

If any check fails, the query is rejected before execution. Malformed ASTs produce clear error messages rather than undefined behaviour.

Type safety across languages

Because WOQL's wire format is a JSON-LD schema, every client library — TypeScript, Python, Rust — generates the same AST structure. A query written in Python produces identical JSON-LD to the same query written in TypeScript:

TypeScript:

Example: TypeScript

const q = triple("v:X", "rdf:type", "@schema:Person")

Python:

Example: Python

q = WOQLQuery().triple("v:X", "rdf:type", "@schema:Person")

Both produce:

Example: JSON

{
  "@type": "Triple",
  "subject": {"@type": "NodeValue", "variable": "X"},
  "predicate": {"@type": "NodeValue", "node": "rdf:type"},
  "object": {"@type": "NodeValue", "node": "@schema:Person"}
}

This uniformity means:

Queries can be stored, shared, and replayed across languages
Server-side tooling needs only one query representation
Testing can verify the AST structure independent of client language
Query templates (stored as JSON-LD) work with any client

Comparison to other approaches

Aspect	SQL	GraphQL	Cypher	SPARQL	WOQL
Representation	Text grammar	Text grammar	Text grammar	Text grammar	JSON-LD AST
Injection risk	High (strings)	Low (typed schema)	Medium (strings)	Medium (strings)	None (by design)
Composition	String concat or ORM	Fragment merging	String concat	String concat	Function composition
Cross-language	Each DB has own dialect	Uniform spec	Neo4j-specific	Uniform spec	Same AST everywhere
Expressiveness	Full relational	Selection/mutation	Graph patterns	Full RDF	Full Datalog
Type checking	Runtime (DB rejects)	Schema validation	Runtime	Runtime	Client + server
Stored queries	Strings in DB	Persisted queries	Strings	Strings	JSON-LD documents

GraphQL comparison

GraphQL is the closest parallel — it also uses a typed schema to validate queries before execution. The key differences:

GraphQL validates query shape (which fields you request) but uses a text grammar that must be parsed
WOQL validates query structure (the AST itself) and never parses text
GraphQL is primarily for data fetching with limited expressiveness (no recursive traversal, aggregation, or complex logic)
WOQL is a full Datalog language with path queries, aggregation, unification, and mathematical operations

TerminusDB supports both GraphQL (for simple data fetching) and WOQL (for complex queries). See Choosing a Query Interface for when to use which.

Implications for application design

You can safely pass user input to WOQL builders

Because values cannot escape into structure, you can construct queries from user input without sanitisation:

Example: TypeScript

// Safe — searchTerm is always a value, never structure
const searchByName = (searchTerm: string) =>
  select(["v:Doc"],
    triple("v:Doc", "name", searchTerm)
  )

Queries are first-class data

Since queries are JSON-LD objects, you can:

Store them in TerminusDB itself (queries about queries)
Version them alongside your data (branch-aware query history)
Compose them dynamically at runtime without string manipulation
Serialise them for logging, debugging, or replay

No need for an ORM

ORMs exist primarily to provide safe query construction over string-based languages. Because WOQL's builders ARE the query language (not a wrapper around strings), there is no ORM layer to add, maintain, or debug. The builder IS the interface.

WOQL Architecture: AST Design & Security Model

The core idea: queries are data structures, not strings

Security by construction

Why string-based queries are vulnerable

Why WOQL cannot be injected

The security guarantee

Execution model

What the server validates

Type safety across languages

Comparison to other approaches

GraphQL comparison

Implications for application design

You can safely pass user input to WOQL builders

Queries are first-class data

No need for an ORM

Further reading

Was this helpful?