A Query Language
Abstract
Datomic’s pull queries
can be a convenient, batteries-included alternative to
GraphQL. We show how. Then we make
up an excuse to write a slightly extended version of the pull
language, using core.spec.
- [2019-12-23] This note is superseeded by the PullQL project.
Declarative query languages have found new popularity as a means of serving the highly varying needs of an increasingly diverse set of clients (TVs and game consoles, smartphones of manifold size, shape, and capability, desktop computers, VR headsets, and even watches). GraphQL is probably the best known effort in this space (and the only one known to me for that matter, besides Netflix’s Falcor). Let’s have a look at a GraphQL expression.
{
human(id: 1002) {
name
starships {
name
class
}
}
}
This will fetch the name of the human with id 1002, as well as their associated starships. For every associated starship, its name and ship class will be contained in the response as well. Note that both GraphQL as well as pull queries must specify a set of root entities, on which to resolve the expression (the set of humans in this case). Translating the above into Clojure, we could end up with something like
{:human [:human/name
{:human/starships [:ship/name
:ship/class]}]}
This is the tiniest of additions on top of plain pull queries. Instead
of passing a query directly, clients now pass a map from root
collections to pull queries. The server can simply fetch all entity
ids in the specified root collection, and run d/pull-many
.
So far so good, but we have lost the capability to filter the human collection by id, like we could in the GraphQL example. Expressing such a filter could be done very directly:
{(human {:db/id 1002}) [:human/name
{:human/starships [:ship/name
:ship/class]}]}
But in order to keep things fresh and interesting, we will do it in a bit more general way, that resembles Datalog’s query clauses more closely:
{:human [[:db/id 1002]
:human/name
{:human/starships [:ship/name
:ship/class]}]}
This even allows us to filter on nested relations as well, something that (to my knowledge) GraphQL does not support at the moment:
{:human [[:db/id 1002]
:human/name
{:human/starships [[:ship/name "Anubis"]
:ship/class]}]}
Parsing
We can use core.spec to define a grammar for this language and get a parser for free — a party trick I’ve picked up from a great talk by Stuart Sierra.
;; a rough grammar
(s/def ::query (s/map-of keyword? ::pattern))
(s/def ::pattern (s/coll-of ::attr-spec))
(s/def ::attr-spec (s/or :attribute ::attr-name
:clause ::clause
:expand ::map-spec))
(s/def ::attr-name keyword?)
(s/def ::clause (s/or :data-pattern ::data-pattern))
(s/def ::data-pattern (s/tuple ::attr-name ::constant))
(s/def ::constant (constantly true)) ;; @TODO
(s/def ::map-spec (s/and (s/map-of ::attr-name ::pattern)
#(= (count %) 1)))
;; a 'parser'
(defn parse [query]
(let [conformed (s/conform ::query query)]
(if (s/invalid? conformed)
(throw (ex-info "Couldn't parse query" (s/explain-data ::query query)))
conformed)))
If we try this out in a REPL, we obtain the following parse tree:
(parse '{:human [[:db/id 1002]
:human/name
{:starships [[:ship/name "Anubis"]
:ship/class]}]})
;; => {:human
;; [[:clause [:data-pattern [:db/id 1002]]]
;; [:attribute :human/name]
;; [:expand
;; {:starships
;; [[:clause [:data-pattern [:ship/name "Anubis"]]]
;; [:attribute :ship/class]]}]]})
Interpretation
As easy as it was, parsing is only half of the game. We now need to interpret expressions in our new language and execute them atop a Datomic database. In its simplest form, the interpreter will keep track of some interpretation context as it walks the parse tree. We can specify the interpretation of the various patterns supported by our language, using a multi-method.
(defmulti impl (fn [ctx node] (first node)))
Our interpretation context is made up of a Datomic database value, a set of datoms storing query results, and finally the set of entity id’s that we’re interested in. The latter is initialized with the set of root entities, as discussed above.
(defn pull [db pattern eids]
(let [ctx {:db db
:datoms #{}
:eids eids}]
(impl ctx [:pattern pattern])))
The simplest construct is the :pattern
, which from the grammar above
we remember as simply a collection of other constructs. Therefore the
interpretation of a :pattern
is precisely the combined
interpretation of all of its constituent constructs.
(defmethod impl :pattern [ctx [_ specs]]
(reduce impl ctx specs))
For :attribute
constructs, we must fetch the desired attribute for
all of the entities that the client is interested in. This can be done
efficiently by accessing Datomic’s indices directly, using the
d/datoms
function. For this lookup, we will access the :aevt
index, which keeps datoms sorted by the attribute they specify.
(defmethod impl :attribute [{:keys [db eids] :as ctx} [_ attr]]
(let [datoms (into #{}
(filter (fn [datom] (contains? eids (.-e datom))))
(d/datoms db :aevt attr))]
(update ctx :datoms set/union datoms)))
Expand constructs traverse relations. Consequently, we must fetch the related entities and pull them recursively. We will then add them alongside the resulting datoms into our interepretation context:
(defmethod impl :expand [{:keys [db eids] :as ctx} [_ map-spec]]
(let [[attr pattern] (first map-spec)
children (into #{}
(filter (fn [datom] (contains? eids (.-e datom))))
(d/datoms db :aevt attr))
children-eids (into #{} (map :v) children)
children-ctx (pull db pattern children-eids)]
(-> ctx
(update :eids set/union (:eids children-ctx))
(update :datoms set/union (into #{} (filter (fn [datom]
(contains? (:eids children-ctx) (.-v datom))) children)))
(update :datoms set/union (:datoms children-ctx)))))
The final supported construct is the :clause
. It filters the set of
entities that the client is interested in, to include only those
matching the specified value. Again we make use of direct index access
via d/datoms
. For some, indexed
attributes, Datomic will maintain
an additional index called :avet
, which keeps datoms sorted by
attribute and value. For all other attributes, we will have to do the
filtering ourselves.
(defmethod impl :clause [{:keys [db eids] :as ctx} [_ clause]]
(let [[_ data-pattern] clause
[attr v] data-pattern
indexed? (:indexed (d/attribute db attr))
matching-datoms (if indexed?
(into #{} (d/datoms db :avet attr v))
(into #{}
(filter (fn [datom] (= (.-v datom) v)))
(d/datoms db :aevt attr)))]
(-> ctx
(update :datoms set/union matching-datoms)
(update :eids set/intersection (into #{} (map :e) matching-datoms)))))
Putting it all together, we first need a way of resolving the specified set of root entities…
(defn resolve-root [db root]
(case root
:human (set (d/q '[:find [?e ...] :where [?e :human/name _]] db))))
…as well as a way of extracting the resulting set of datoms from the final interpretation context and collecting them into a tree.
(defn resolve-query [db query]
(->> query
(parse)
(reduce-kv
(fn [result root pattern]
(let [root-eids (resolve-root db root)
ctx (pull db pattern root-eids)
eids (:eids ctx)
entities (->> (:datoms ctx)
(reduce
(fn [tree datom]
(if-not (contains? eids (.-e datom))
tree
(let [attr (d/attribute db (.-a datom))
ref? (= (:value-type attr) :db.type/ref)]
(if ref?
(update-in tree [(.-e datom) (:ident attr)] conj (.-v datom))
(assoc-in tree [(.-e datom) (:ident attr)] (.-v datom)))))) {}))
hydrate (fn [eid]
(->> (get entities eid)
(reduce-kv
(fn [entity a v]
(let [attr (d/attribute db a)
ref? (= (:value-type attr) :db.type/ref)]
(if-not ref?
(assoc entity a v)
(if (coll? v)
(assoc entity a (mapv entities v))
(assoc entity a (get entities v)))))) {})))
;; hydrate
tree (->> root-eids
(into [] (comp (map hydrate) (remove empty?))))]
(assoc result root tree))) {})))
And at last, we can try it out in the REPL.
(resolve-query
(d/db conn)
'{:human [:human/name
{:human/starships [:ship/name
[:ship/class :ship.class/fighter]]}]})
;; => {:human
;; [#:human{:name "Naomi Nagata",
;; :starships
;; [#:ship{:name "Roci", :class :ship.class/fighter}]}
;; #:human{:starships
;; [#:ship{:name "Roci", :class :ship.class/fighter}],
;; :name "Amos Burton"}]}
All code is available on Github.