
This article discusses the method of the paper "Query2Box: Reasoning over Knowledge Graphs in Vector Space using Box Embeddings" by Ren et al [1] and explains the intution behind the defined box operations and distance metrics.
Some concepts
- Disjunctive Normal Form:
- Conjunctive Normal Form:
- Existential Positive First Order (EPFO) [2] logical queries that include any set of
, and .
Knowledge Graphs and Conjunctive Queries
Figure 1. Query2Box reasoning framework. Source: [1]
Knowledge graphs
A knowledge graph (KG) is defined as
Conjunctive queries
The goal of answering the logical query
where
or
Intuition:
Three types of directed edges
Projection operation: Given a set of entities
Intersection operation: Given a set of entity sets
Union operation: Given a set of entity sets
Reasoning over Sets of Entities Using Box embeddings
The key idea is that it is possible to find a space and a (rectangular) box in that space enclosing all answers to the query.
The operation is in
(centers and offsets along all dimensions)
where
Figure 1(D) illustrates the reasoning process. The process maps source nodes (anchor entities) to boxes following the respective relation projections. The final answers are the intersection or union of those boxes.
Initial boxes for source nodes
They are regarded as an anchor entity
Geometric projection operator
The relation
with
Given an input box embedding
where they sum the centers and sum the offsets (Figure 2(A)).
Geometric intersection operator
They model the intersection of a set of box embeddings
which is calculated by performing attention over the box centers and shrinking the box using the sigmoid function.
where
is the dimension-wise product is the multi-layer perceptron [3] is the permutation-invariant deep architecture.
Both
They model all the deep sets by [4]:
The intuition is to generate a smaller box that lies inside a set of boxes (Figure 2(B), smaller:
Author claimed that the geometric intersection operator is more expressive and robust than raw box intersection.
Figure 2. The geometric intuition of the two operations and the distance function in Query2Box. Source: [1]
Figure 3. "Penalized-with-respect-to" plot.
Entity-to-box distance
The distance is the evaluation metric.
Given a query box
where
Intuition of downweighing
As long as entity vectors are inside the box, it is regarded as “close enough”.
When
Intuition on distances: tl;dr
If the entity is inside the box, then
If the entity is located outside the box, then the
Detailed discussion on
The intuition of
Figure 4.
The intuition of
If the entity
which is the L1 distance between
If the entity
- and located to the right of
, , - and located to the left of
, ,
Training objective
Negative sampling loss [5]
where
Handling Disjunction with Disjunctive Normal Form
A challenge for the union operation
So far, a query defines a box projection and intersection id described by a geometric intersection operation on boxes. The two operations are simple and straightforward. An immediate challenge for union operation on boxes is that the union operation over boxes is not closed: 1. The union of boxes may not be a box 2. The union of boxes may be a collection of discontinuous regions. If we define a bigger box that encapsulates all those regions, we may be forced to include entities that are not supposed to be included.
Disjunctive Normal Form (DNF)
Any first-order logic can be transformed to the equivalent DNF [6]. Essentially, this means union operation is moved to the last step (Figure 5).
Figure 5. Converting a computation graph of an EPFO query into an equivalent computation graph of DNF. Source: [1]
Distance metric with union operator
Since
when
For
References
[1] N. Dalvi and D. Suciu, "The dichotomy of probabilistic inference for unions of conjunctive queries," Journal of the ACM (JACM), vol. 59, no. 6, pp. 1-87, 2013.
[2] H. Ren, W. Hu, and J. Leskovec, "Query2box: Reasoning over knowledge graphs in vector space using box embeddings," arXiv preprint arXiv:2002.05969, 2020.
[3] M. Zaheer, S. Kottur, S. Ravanbakhsh, B. Poczos, R. Salakhutdinov, and A. Smola, "Deep sets," arXiv preprint arXiv:1703.06114, 2017.
[4] W. L. Hamilton, P. Bajaj, M. Zitnik, D. Jurafsky, and J. Leskovec, "Embedding logical queries on knowledge graphs," arXiv preprint arXiv:1806.01445, 2018.
[5] T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of word representations in vector space," arXiv preprint arXiv:1301.3781, 2013.
[6] B. A. Davey and H. A. Priestley, Introduction to lattices and order. Cambridge university press, 2002.
- Post title:Query2Box
- Post author:Lutao Dai
- Create time:2022-03-20 16:09:00
- Post link:https://lutaodai.github.io/2022-03-20-query2box/
- Copyright Notice:All articles in this blog are licensed under BY-NC-SA unless stating additionally.