A Minimalistic Introduction to einsum
Lutao Dai

This post introduces the essence of the Einstein summation convention (einsum).



Introduction

einsum streamlines the following algebraic operations:

1. matrix multiplication
2. summation over given dimensions
3. transposition

The NumPy function implementing the Einstein summation convention is

np.einsum(subscripts, *operands)

where subscripts is a string defining the operation, and operands are the arrays to operate on.
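For instance, ordinary matrix multiplication can be written as a single einsum call (a minimal sketch with two small arrays chosen for illustration):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# subscripts='ij,jk->ik': multiply over the shared index j, keep i and k.
C = np.einsum('ij,jk->ik', A, B)
print(C)  # identical to A @ B
```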



Principles

The Einstein summation convention follows a few principles. In short:

Inputs (to the left of the ->) control multiplication. The output (to the right of the ->) controls summation and transposition.

Input arrays

  1. (P1) A letter repeated across input arrays marks a dimension whose values are multiplied together.
  2. (P2) A label that appears only once is not summed.
  3. (P3) A label repeated within one operand takes the diagonal.

The output array

  1. (P4, summation) A letter omitted from the output array is summed over; conversely, dimensions whose letters appear in the output are kept.
  2. (P5, order) In implicit mode (without ->), the choice of subscripts matters because the output axes are reordered alphabetically. In explicit mode (with ->), the output axis order is defined by the subscripts.

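These principles can be checked directly in NumPy. The sketch below illustrates P3 (diagonal), P4 (summation by omission), and P5 (alphabetical reordering in implicit mode):

```python
import numpy as np

M = np.arange(9).reshape(3, 3)

# P3: a label repeated within one operand takes the diagonal.
assert np.array_equal(np.einsum('ii->i', M), np.diag(M))

# P4: omitting a label from the output sums over that dimension.
assert np.einsum('ii->', M) == np.trace(M)

# P5: implicit mode sorts output axes alphabetically, so 'ji'
# (j = first axis, i = second axis) yields the transpose.
assert np.array_equal(np.einsum('ji', M), M.T)
```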

Examples

A, B: 1D arrays

Subscripts / NumPy equivalent / Principles:

  • ('i->', A) = np.sum(A). P4: 'i' is omitted from the output, so the dimension is summed away.
  • ('i,i->i', A, B) = A * B. P1: the repeated 'i' on the left multiplies the arrays elementwise over that dimension; P4: 'i' is kept in the output, so there is no summation.
  • ('i,i', A, B) = np.inner(A, B). P1: the repeated 'i' on the left multiplies over that dimension; P4: 'i' is omitted from the output, so the products are summed and the dimension collapses.
  • ('i,j->ij', A, B) = np.outer(A, B). P2: each label appears only once, so nothing is summed; every (i, j) pair is multiplied, equivalent to the broadcast A[:, None] * B[None, :].
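The 1D identities above can be verified directly (a quick check, assuming two small 1D arrays):

```python
import numpy as np

A = np.array([1.0, 2.0, 3.0])
B = np.array([4.0, 5.0, 6.0])

assert np.einsum('i->', A) == np.sum(A)                            # P4
assert np.array_equal(np.einsum('i,i->i', A, B), A * B)            # P1 + P4
assert np.einsum('i,i', A, B) == np.inner(A, B)                    # P1 + P4
assert np.array_equal(np.einsum('i,j->ij', A, B), np.outer(A, B))  # P2
```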

A, B: 2D arrays

Subscripts / NumPy equivalent / Principles:

  • ('ij', A) = A. P2, P5.
  • ('ji', A) = A.T. P2, P5: the subscript names the first and second axes of A as j and i. In implicit mode the output axes are reordered alphabetically, so the second axis of A (labeled i) becomes the first axis of the output.
  • ('ij->', A) = np.sum(A). P4.
  • ('ij->j', A) = np.sum(A, axis=0). P4.
  • ('ij->i', A) = np.sum(A, axis=1). P4.
  • ('ii', A) or ('ii->', A) = np.trace(A). P3: the repeated 'ii' selects the diagonal; P4: 'i' is missing from the output, so the diagonal is summed.
  • ('ii->i', A) = np.diag(A). P3: 'ii' selects the diagonal; P4: 'i' is kept, so there is no summation.
  • ('ij,ij->ij', A, B) = A * B. P1, P5.
  • ('ij,ji->ij', A, B) = A * B.T. P1: the first axis of A is multiplied by the second axis of B and vice versa; P5: the output axes follow those of A.
  • ('ij,jk', A, B) = np.dot(A, B). P1, P5.
  • ('ij,kj->ik', A, B) = np.inner(A, B) (for higher-dimensional arrays, np.inner is a sum product over the last axes). P1, P4, P5.
  • ('ij,kj->ijk', A, B) = A[:, :, None] * B.T, i.e. out[i, j, k] = A[i, j] * B[k, j]. P1: the repeated 'j' pairs the second axes of A and B for multiplication; P4: all three labels appear in the output, so nothing is summed; P5 fixes the axis order.
  • ('ij,kl->ijkl', A, B) = A[:, :, None, None] * B. P2, P5.
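These 2D identities can likewise be spot-checked. The sketch below uses random arrays (the shapes are arbitrary choices for illustration) and includes the broadcast form of 'ij,kj->ijk':

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((4, 3))   # shares only the 'j' dimension with A
M = rng.standard_normal((3, 3))
N = rng.standard_normal((3, 3))

# np.inner is a sum product over the last axes of both arrays.
assert np.allclose(np.einsum('ij,kj->ik', A, B), np.inner(A, B))

# All labels kept in the output -> no summation, just broadcasting:
# out[i, j, k] = A[i, j] * B[k, j]
assert np.allclose(np.einsum('ij,kj->ijk', A, B), A[:, :, None] * B.T)

# Implicit mode: 'ij,jk' contracts over j, i.e. ordinary matmul.
assert np.allclose(np.einsum('ij,jk', M, N), M @ N)
assert np.allclose(np.einsum('ij,ji->ij', M, N), M * N.T)
```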

Ellipsis (...)

Subscripts / Meaning:

  • np.einsum('...ii->...i', A): the diagonal entries of the last two dimensions.
  • np.einsum('i...i', A): the trace along the first and last axes.
  • np.einsum('ij...,jk...->ik...', A, B): matrix multiplication over the left-most indices instead of the right-most ones.
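A short sketch of the ellipsis forms, assuming random arrays with arbitrarily chosen batch shapes:

```python
import numpy as np

rng = np.random.default_rng(1)

# '...ii->...i': diagonal over the last two axes; batch axes untouched.
A = rng.standard_normal((2, 5, 3, 3))   # a (2, 5) batch of 3x3 matrices
d = np.einsum('...ii->...i', A)
assert np.allclose(d, np.diagonal(A, axis1=-2, axis2=-1))

# 'i...i': trace over the first and last axes.
T = rng.standard_normal((3, 4, 3))
assert np.allclose(np.einsum('i...i', T), np.trace(T, axis1=0, axis2=2))

# 'ij...,jk...->ik...': contract over j with the matrix axes left-most.
X = rng.standard_normal((2, 3, 5))
Y = rng.standard_normal((3, 4, 5))
Z = np.einsum('ij...,jk...->ik...', X, Y)
# Reference: move the trailing batch axis to the front, batched matmul,
# then move it back.
ref = np.moveaxis(np.moveaxis(X, -1, 0) @ np.moveaxis(Y, -1, 0), 0, -1)
assert np.allclose(Z, ref)
```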


References

  • Post title: A Minimalistic Introduction to einsum
  • Post author: Lutao Dai
  • Create time: 2022-05-03 16:02:00
  • Post link: https://lutaodai.github.io/2022-05-03-minimalistic-intro-einsum/
  • Copyright notice: All articles in this blog are licensed under BY-NC-SA unless stated otherwise.