This article discusses the method of the paper "Query2Box: Reasoning over Knowledge Graphs in Vector Space using Box Embeddings" by Ren et al [1] and explains the intution behind the defined box operations and distance metrics.

Some concepts

Disjunctive Normal Form:
Conjunctive Normal Form:
Existential Positive First Order (EPFO) [2] logical queries that include any set of , and .

Knowledge Graphs and Conjunctive Queries

Figure 1. Query2Box reasoning framework. Source: [1]

Knowledge graphs

A knowledge graph (KG) is defined as , where is an entity and is a binary function , i.e., iff .

Conjunctive queries

The goal of answering the logical query is to find a set of entities $⟦ ⟧$ such that $⟦ ⟧$ iff .

where ;

or .

is a non-variable anchor entity, are existentially quantified bound variables.

Intuition: are potential answers. The final answer set/denotation set satisfies all imposed relation constraints .

Three types of directed edges

Projection operation: Given a set of entities and relation , the operator obtains , where . (Projection is the collection of tails/sink target)

Intersection operation: Given a set of entity sets , this operator obtains

Union operation: Given a set of entity sets , this operator obtains

Reasoning over Sets of Entities Using Box embeddings

The key idea is that it is possible to find a space and a (rectangular) box in that space enclosing all answers to the query.

The operation is in , and a box in is defined by

(centers and offsets along all dimensions)

where is element-wise inequality. is the positive offset of the box.

Figure 1(D) illustrates the reasoning process. The process maps source nodes (anchor entities) to boxes following the respective relation projections. The final answers are the intersection or union of those boxes.

Initial boxes for source nodes

They are regarded as an anchor entity , or a point in the space . Equivalently, they can be seen as a bounding box whose offset is 0:

Geometric projection operator

The relation is embedded as

with .

Given an input box embedding , they model the projection by

where they sum the centers and sum the offsets (Figure 2(A)).

Geometric intersection operator

They model the intersection of a set of box embeddings as

which is calculated by performing attention over the box centers and shrinking the box using the sigmoid function.

where

is the dimension-wise product
is the multi-layer perceptron
[3] is the permutation-invariant deep architecture.

Both and are applied elementwise.

They model all the deep sets by [4]:

The intuition is to generate a smaller box that lies inside a set of boxes (Figure 2(B), smaller: and ; inside: weighted sum of the centers)

Author claimed that the geometric intersection operator is more expressive and robust than raw box intersection.

Figure 2. The geometric intuition of the two operations and the distance function in Query2Box. Source: [1]

Figure 3. "Penalized-with-respect-to" plot.

Entity-to-box distance

The distance is the evaluation metric.

Given a query box and an entity vector , the distance is defined as

where is a fixed scalar and

Intuition of downweighing

As long as entity vectors are inside the box, it is regarded as “close enough”.

When , the reduces to the ordinary L1 distance

Intuition on distances: tl;dr

If the entity is inside the box, then , and is the L1 distance between and .

If the entity is located outside the box, then the the L1 distance to the closest corner of the boarder, and is a constant, whose value is the L1 distance between the center and any corner of the box.

Detailed discussion on and

The intuition of (Figure 3 & Figure 4): at any direction, the has the shape:

Figure 4. in any given direction

The intuition of :

If the entity is inside the box (equivalently located between and along a given directions), then , and . Then,

which is the L1 distance between and the center.

If the entity is outside the box,

and located to the right of , ,
and located to the left of , ,

Training objective

Negative sampling loss [5]

where represents a fixed scalar margin; $⟦ ⟧$ is a positive entity and $⟦ ⟧$ is the -th negative entity. is the number of negative entities.

Handling Disjunction with Disjunctive Normal Form

A challenge for the union operation

So far, a query defines a box projection and intersection id described by a geometric intersection operation on boxes. The two operations are simple and straightforward. An immediate challenge for union operation on boxes is that the union operation over boxes is not closed: 1. The union of boxes may not be a box 2. The union of boxes may be a collection of discontinuous regions. If we define a bigger box that encapsulates all those regions, we may be forced to include entities that are not supposed to be included.

Disjunctive Normal Form (DNF)

Any first-order logic can be transformed to the equivalent DNF [6]. Essentially, this means union operation is moved to the last step (Figure 5).

Figure 5. Converting a computation graph of an EPFO query into an equivalent computation graph of DNF. Source: [1]

Distance metric with union operator

Since is logically equivalent to , the aggregated distance function can be naturally defined as

when is a conjunctive query, i.e.,

For , takes the minimum distance to the closest box as the distance to an entity. The intuition of the "Min" operator is that an entity is inside the union of sets as long as the entity is in one of the sets.

References

[1] N. Dalvi and D. Suciu, "The dichotomy of probabilistic inference for unions of conjunctive queries," Journal of the ACM (JACM), vol. 59, no. 6, pp. 1-87, 2013.

[2] H. Ren, W. Hu, and J. Leskovec, "Query2box: Reasoning over knowledge graphs in vector space using box embeddings," arXiv preprint arXiv:2002.05969, 2020.

[3] M. Zaheer, S. Kottur, S. Ravanbakhsh, B. Poczos, R. Salakhutdinov, and A. Smola, "Deep sets," arXiv preprint arXiv:1703.06114, 2017.

[4] W. L. Hamilton, P. Bajaj, M. Zitnik, D. Jurafsky, and J. Leskovec, "Embedding logical queries on knowledge graphs," arXiv preprint arXiv:1806.01445, 2018.

[5] T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of word representations in vector space," arXiv preprint arXiv:1301.3781, 2013.

[6] B. A. Davey and H. A. Priestley, Introduction to lattices and order. Cambridge university press, 2002.