5* Knowledge Graph Embeddings with Projective Transformations @AAAI2021
In this post, we would like to present our recent contribution to knowledge graph embedding (KGE) models, which was accepted at the AAAI 2021 conference.
Knowledge Graph Embeddings
Knowledge Graphs [1] are used by many organisations to store and structure relevant information. In knowledge graphs, entities are represented by nodes and relationships are represented by edges. In the example here, you can, for instance, see that Carl Gotthard Langhans is the architect of Brandenburg Gate.
Most current machine learning methods require an input in the form of features, which means they cannot directly use a graph as input. For this reason, we aim to learn feature representations, so-called embeddings, of entities and relations in the knowledge graph.
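As a minimal illustration of what such embeddings are, here is a small Python sketch: each entity and each relation type is associated with a vector, stored in a lookup table. The entity names, dimensionality and random initialisation below are purely illustrative; in practice the vectors are learned by optimising a link-prediction objective.

```python
import numpy as np

# Purely illustrative: in practice these vectors are learned from the graph,
# and the entity names and dimensionality are only examples.
entities = ["Brandenburg_Gate", "Carl_Gotthard_Langhans"]
relations = ["architect"]
dim = 4

rng = np.random.default_rng(0)
entity_emb = {e: rng.normal(size=dim) for e in entities}     # one vector per node
relation_emb = {r: rng.normal(size=dim) for r in relations}  # one vector per relation type

print(entity_emb["Brandenburg_Gate"], relation_emb["architect"])
```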
Limitations of Existing Embedding Models
Performing link prediction with knowledge graph embedding models has become a popular approach to knowledge graph completion. Such models employ a transformation function that maps nodes via edges into a vector space in order to measure the likelihood of links. While mapping the individual nodes, the structure of subgraphs is also transformed. Most existing KGE models are designed in Euclidean geometry and usually support a single transformation type, often translation or rotation.
This limits the ability of KGE models to embed KGs with complex motifs in their subgraphs, especially when multiple structures exist in the same neighborhood. An example of this situation is a path structure formed by one group of nodes lying close to a loop structure formed by another group (illustrated in the figure below, with a code sketch of the "single transformation type" scoring idea after it).
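To make the "single transformation type" point concrete, the sketch below shows translation-based (TransE-style) and rotation-based (RotatE-style) scoring functions. The vectors and angles are illustrative only; real implementations learn these parameters and add negative sampling, margins and regularisation.

```python
import numpy as np

def transe_score(h, r, t):
    """Translation-based score (TransE): the tail should lie near head + relation."""
    return -np.linalg.norm(h + r - t)

def rotate_score(h, theta, t):
    """Rotation-based score (RotatE): the tail should lie near the head rotated,
    dimension-wise, by the relation-specific angles theta (complex vectors)."""
    return -np.linalg.norm(h * np.exp(1j * theta) - t)

# Illustrative vectors; a higher (less negative) score means a more plausible link.
h_real, r_real, t_real = np.array([0.2, -0.3]), np.array([0.1, 0.4]), np.array([0.3, 0.1])
print(transe_score(h_real, r_real, t_real))

h_c = np.array([0.2 + 0.1j, -0.3 + 0.4j])
t_c = np.array([0.1 - 0.2j, 0.5 + 0.0j])
theta = np.array([0.7, -1.2])
print(rotate_score(h_c, theta, t_c))
```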
Projective Transformation Functions
A projective transformation can simultaneously perform translation, rotation, homothety, reflection and inversion, and this flexibility is the foundation of our approach.
As you can see, there is a sphere laid on a plane. For each point on the sphere (except the north pole), there is a counterpart on the complex plane, obtained by projecting from the sphere onto the plane. Points closer to the north pole are projected farther away, and the north pole itself is projected to infinity on the plane.
Here, sequences of points (on the sphere or on the plane) form flows. The transition from one point to the next is performed by the projective transformation. The projective transformation is a parametric function: depending on the values of its parameters, it gives the vector space (the shape of the flows) different forms, namely circular, elliptic, hyperbolic, parabolic and loxodromic.
In the circular case, the projective transformation has one fixed point (here at the origin), and the flows form circles around it. In the elliptic case, the transformation has two fixed points and the flows circulate around each of them. In the hyperbolic case, there are again two fixed points, but the flows emanate from one fixed point and sink into the other. The parabolic case has a single fixed point (here at the north pole): the flows on the sphere start from this point and, after passing along the sphere, return to it; note that on the plane these flows become straight lines. In the loxodromic case, there are two fixed points on the sphere (one close to the north pole, the other close to the south pole). The flows spiral from one fixed point to the other; after projecting them from the sphere to the plane, one fixed point lies close to the origin and the other is projected far away, so on the plane the flows travel a long spiral path between the two fixed points. As can be seen, the projective transformation is flexible enough to give the vector space many different shapes and consequently to encode much more structural information.
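On the complex plane, such a projective transformation can be written as a Möbius transformation f(z) = (az + b)/(cz + d). The sketch below applies such a map and computes its fixed points; the parameter values are illustrative, and the classes above correspond to different parameter regimes (the circular case being a special situation with a single fixed point at the origin).

```python
import numpy as np

def mobius(z, a, b, c, d):
    """Projective (Mobius) transformation f(z) = (a*z + b) / (c*z + d)."""
    return (a * z + b) / (c * z + d)

def fixed_points(a, b, c, d):
    """Fixed points solve c*z^2 + (d - a)*z - b = 0 (plus infinity when c == 0)."""
    if c == 0:
        return [b / (d - a)] if a != d else []
    disc = np.sqrt((d - a) ** 2 + 4 * b * c)
    return [((a - d) + disc) / (2 * c), ((a - d) - disc) / (2 * c)]

# Illustrative parameters, normalised so that a*d - b*c = 1.
a, b, c, d = 1.2 + 0.5j, 0.3 + 0.0j, 0.1 + 0.0j, 1.0 + 0.0j
s = np.sqrt(a * d - b * c)
a, b, c, d = a / s, b / s, c / s, d / s

z = 0.4 + 0.2j
print("f(z) =", mobius(z, a, b, c, d))
print("fixed points:", fixed_points(a, b, c, d))
# The class (elliptic, hyperbolic, parabolic, loxodromic) is determined by the
# normalised trace a + d: real with |a + d| < 2, > 2, = 2, or non-real, respectively.
print("trace:", a + d)
```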
Formulation of Projective Transformation
Projective geometry is the type of geometry in which parallel lines intersect at infinity, which also corresponds to how we humans see the world. In our new model 5*E, we use complex numbers with real and imaginary parts in projective geometry. I'll show you here how the transformation is done. A complex number is represented by a point in the complex plane, which you can see at the bottom here. The transformation works by first projecting a point in the complex plane to a point on the sphere. This is the so-called Riemann sphere, a representation of the complex numbers extended by infinity. In the second step we move the sphere to a new position, and in the third step we project the result back to the complex plane.
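The sketch below shows one common convention for this stereographic projection: a unit sphere centred at the origin, with the north pole as the projection point. The exact convention in the figure (e.g. a sphere resting on the plane) may differ, but the idea is the same.

```python
import numpy as np

def plane_to_sphere(z):
    """Stereographic projection of a complex number onto the unit (Riemann) sphere;
    the north pole (0, 0, 1) corresponds to the point at infinity."""
    x, y = z.real, z.imag
    denom = 1.0 + x**2 + y**2
    return np.array([2*x / denom, 2*y / denom, (x**2 + y**2 - 1) / denom])

def sphere_to_plane(p):
    """Inverse stereographic projection from the sphere back to the complex plane."""
    X, Y, Z = p
    return complex(X / (1 - Z), Y / (1 - Z))

z = 0.5 + 0.25j
p = plane_to_sphere(z)
print(p)                    # a point on the unit sphere
print(sphere_to_plane(p))   # round-trips back to (0.5+0.25j)
```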
5*E Knowledge Graph Embedding
Below, we show the stereographic projection used to map nodes. The relation-specific transformation performed by 5*E can be represented as a composition of three transformations: first, the original head embedding is mapped from the complex plane to the Riemann sphere, shown on the right side of the figure. The sphere is then moved to a new position, indicated by M. Finally, the point on the moved sphere is mapped back to the complex plane using the stereographic projection. In this way, a richer transformation maps head to tail entities, which enables capturing more complex structures.
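Putting the pieces together, the sketch below shows an illustrative 5*E-style plausibility score: a dimension-wise Möbius transformation, parameterised per relation, is applied to the head embedding and the result is compared with the tail embedding. This is a simplified sketch; the exact scoring function, normalisation and loss used in the paper may differ.

```python
import numpy as np

def mobius(z, a, b, c, d):
    """Element-wise projective (Mobius) transformation of a complex vector."""
    return (a * z + b) / (c * z + d)

def score(head, tail, a, b, c, d):
    """Illustrative 5*E-style plausibility score: transform the head embedding with
    the relation-specific map and measure closeness to the tail embedding.
    (A sketch only; the paper's exact scoring function may differ.)"""
    return -np.linalg.norm(mobius(head, a, b, c, d) - tail)

dim = 3
rng = np.random.default_rng(1)
head = rng.normal(size=dim) + 1j * rng.normal(size=dim)
tail = rng.normal(size=dim) + 1j * rng.normal(size=dim)
# One set of relation parameters per embedding dimension (illustrative values).
a, b, c, d = (rng.normal(size=dim) + 1j * rng.normal(size=dim) for _ in range(4))
print(score(head, tail, a, b, c, d))
```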
The relation-specific transformation in 5*E can be represented as a composition of translation, rotation, homothety, reflection and inversion. As you see, the left side shows the complex plane and the right side shows the plane with the Riemann sphere in 3D space. Translation is performed by moving the sphere from one position to another. Homothety is performed by moving the sphere along the z-axis, which changes the size of the shape in the complex plane. Rotation of the shape in the plane is performed by rotating the sphere around the z-axis. Inversion and reflection correspond to rotations of the sphere around the real or imaginary axis.
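On the plane, each of these elementary transformations corresponds to a particular choice of Möbius parameters. The values below are illustrative:

```python
import numpy as np

def mobius(z, a, b, c, d):
    """Projective (Mobius) transformation f(z) = (a*z + b) / (c*z + d)."""
    return (a * z + b) / (c * z + d)

z = 0.6 + 0.3j           # an arbitrary illustrative point
t = 0.2 - 0.1j           # translation offset
theta = np.pi / 4        # rotation angle
s = 2.0                  # homothety (scaling) factor

print(mobius(z, 1, t, 0, 1))                   # translation:  z + t
print(mobius(z, np.exp(1j * theta), 0, 0, 1))  # rotation:     e^{i*theta} * z
print(mobius(z, s, 0, 0, 1))                   # homothety:    s * z
print(mobius(z, 0, 1, 1, 0))                   # z -> 1/z: inversion in the unit
# circle combined with a reflection in the real axis.
```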
Evaluation and Results
We analysed the behaviour of 5*E at both the entity level and the relation level, and captured the changes for inverse relations, as shown below. We fed a grid into the transformation learned by 5*E for the partof and haspart relations, as well as for the hypernym and hyponym relations, which are inverses of each other. Lines are mapped to circles, and the transformations learned for each pair are inverses of each other.
Here are example visualizations of transformations in a KG. On the left, you can see relation embeddings. The top shows a grid of complex numbers; any point on the grid can be one dimension of an entity embedding. Below, you can see how those points are transformed by the relation-specific transformation. This example is taken from the WordNet knowledge graph, and you can see the inverse relations hypernym and hyponym here. Inverse means that whenever A is a hypernym of B, then B is a hyponym of A. What is very nice here is that this inversion pattern is modelled as a reflection in the embedding space, as you can see. This indicates that the model can preserve the inverse relationship in the embedding space.
On the right side, you can see the embeddings of entities at the top, which are transformed into the embeddings at the bottom. The interesting aspect here is that lines are transformed to circles and circles to lines. No other model can do this, and this ability influences how well the structure of the knowledge graph is preserved.
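The "lines map to circles" behaviour can be checked numerically with the simplest projective map that has this property, z -> 1/z (illustrative only; the learned relation transformations are more general):

```python
import numpy as np

# The simplest projective map exchanging lines and circles is z -> 1/z.
line = 1.0 + 1j * np.linspace(-5, 5, 200)    # the vertical line Re(z) = 1
image = 1.0 / line

# Every image point lies on the circle of radius 0.5 centred at 0.5.
print(np.allclose(np.abs(image - 0.5), 0.5))  # True
```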
Apart from the performance in practice, a nice thing is that we could also prove some formal properties. We could show that the model is fully expressive. A model is fully expressive if it can accurately separate existing and non-existing triples for an arbitrary KG.
We could also show that the model is capable of inferring several relational patterns, in particular rule composition, inverse rules and symmetric rules. Inference here means that when the premise is true according to the score function of the model, then the conclusion is also true.
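For the inverse pattern, the intuition can be illustrated directly: the inverse of a Möbius transformation is again a Möbius transformation, obtained from the inverse of its 2x2 parameter matrix. The parameters below are illustrative.

```python
import numpy as np

def mobius(z, a, b, c, d):
    """Projective (Mobius) transformation f(z) = (a*z + b) / (c*z + d)."""
    return (a * z + b) / (c * z + d)

def inverse_params(a, b, c, d):
    """Parameters of the inverse map, i.e. the inverse of [[a, b], [c, d]] up to scale."""
    return d, -b, -c, a

# Illustrative parameters for a relation r; applying r and then its inverse
# should recover the original point.
a, b, c, d = 1.1 + 0.4j, 0.3 + 0.0j, 0.2 - 0.1j, 0.9 + 0.0j
z = 0.5 - 0.7j

forward = mobius(z, a, b, c, d)
back = mobius(forward, *inverse_params(a, b, c, d))
print(np.allclose(back, z))   # True
```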
Last, but not least, we could show that the model subsumes various state-of-the-art models. Subsumption here means that any scoring of an arbitrary KG achievable by a subsumed model can also be achieved by the more general model. Essentially, all other models are special cases of the proposed model.
Link to Paper: http://jens-lehmann.org/files/2021/aaai_5StarE.pdf