# A Minimalistic Introduction to einsum

This post introduces the essence of the Einstein summation convention (einsum).
## Introduction

`einsum` streamlines the following algebraic operations:

1. matrix multiplication
2. summation over given dimensions
3. transposition

The NumPy function for the Einstein summation convention is

```python
np.einsum(subscripts, *operands)
```

where `subscripts` is a string defining the operation, and `operands` are the arrays to operate on.
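As a quick illustration before the principles, here is a minimal sketch (array values are my own, chosen for demonstration) showing matrix multiplication written as an einsum:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)   # shape (2, 3)
B = np.arange(12).reshape(3, 4)  # shape (3, 4)

# 'ij,jk->ik': multiply over the shared label j, then sum it away
C = np.einsum('ij,jk->ik', A, B)
assert np.array_equal(C, A @ B)
```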
## Principles

The Einstein summation convention follows the principles below. In short: the inputs (left of the `->`) control multiplication; the output (right of the `->`) controls summation and transposition.

Input arrays:

- (P1) Letters repeated across input arrays mark dimensions to be multiplied together.
- (P2) A label that appears only once is not summed.
- (P3) A label repeated within a single operand takes the diagonal.

The output array:

- (P4, summation) Letters omitted from the output are summed over; dimensions corresponding to non-omitted letters are kept.
- (P5, order) In implicit mode (without `->`), the choice of subscripts matters because the output axes are reordered alphabetically. In explicit mode (with `->`), the axis order is defined by the output subscripts.
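To make the principles concrete, here is a small runnable sketch (array values are my own, chosen for illustration) exercising each one:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])

# P1: 'i' repeated across inputs -> elementwise multiplication over that axis
assert np.allclose(np.einsum('i,i->i', v, w), v * w)

# P3: 'ii' repeated within one operand -> take the diagonal
assert np.allclose(np.einsum('ii->i', A), np.diag(A))

# P4: a letter omitted from the output is summed over
assert np.allclose(np.einsum('ij->j', A), A.sum(axis=0))

# P5: implicit mode reorders output axes alphabetically, so 'ji' transposes
assert np.allclose(np.einsum('ji', A), A.T)
```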
## Examples

### `A`, `B`: 1D arrays

| Subscripts | NumPy | Principles |
|---|---|---|
| `('i->', A)` | `np.sum(A)` | P4 |
| `('i,i->i', A, B)` | `A * B` | P1: repeated `i` on the left → multiplication over that dimension. P4: nothing omitted → the dimension is kept. |
| `('i,i', A, B)` | `np.inner(A, B)` | P1: repeated `i` on the left → multiplication over that dimension. P4: `i` omitted → summation collapses the dimension. |
| `('i,j->ij', A, B)` | `np.outer(A, B)` | P1: no repeated letters on the left → the multiplication pattern is derived from the output: every element of `A` multiplies every element of `B`. |
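The 1D rows above can be checked directly with a short sketch (array values are my own):

```python
import numpy as np

A = np.array([1.0, 2.0, 3.0])
B = np.array([4.0, 5.0, 6.0])

assert np.einsum('i->', A) == A.sum()                     # total sum (P4)
assert np.allclose(np.einsum('i,i->i', A, B), A * B)      # elementwise product
assert np.isclose(np.einsum('i,i', A, B), np.inner(A, B)) # inner product
assert np.allclose(np.einsum('i,j->ij', A, B), np.outer(A, B))  # outer product
```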
### `A`, `B`: 2D arrays

| Subscripts | NumPy | Principles |
|---|---|---|
| `('ij', A)` | `A` | P2, P5 |
| `('ji', A)` | `A.T` | P2, P5: the subscript names the first and second axes of `A` as `j` and `i`. In implicit mode, the output axes are rearranged alphabetically, so the second axis of `A` becomes the first axis of the output. |
| `('ij->', A)` | `np.sum(A)` | P4 |
| `('ij->j', A)` | `np.sum(A, axis=0)` | P4 |
| `('ij->i', A)` | `np.sum(A, axis=1)` | P4 |
| `('ii', A)` or `('ii->', A)` | `np.trace(A)` | P3: `ii` takes the diagonal (an array of size `i`); P4: the only named label `i` is missing from the output → summation. |
| `('ii->i', A)` | `np.diag(A)` | P3: `ii` takes the diagonal; P4: the dimension `i` is kept. |
| `('ij,ij->ij', A, B)` | `A * B` | P1, P5 |
| `('ij,ji->ij', A, B)` | `A * B.T` | P1: the first axis of `A` is multiplied with the second axis of `B`, and the second axis of `A` with the first axis of `B`. P5: the output is arranged the same way as `A`. |
| `('ij,jk', A, B)` | `np.dot(A, B)` | P1, P5 |
| `('ij,kj->ik', A, B)` | `np.inner(A, B)` (in higher dimensions, `inner` executes a sum product over the last axes) | P1, P4, P5 |
| `('ij,kj->ikj', A, B)` | `A[:, None] * B` | `A[:, None]` is equivalent to `i1j` (inserting a singleton dimension in the second position), which broadcasts against `kj` to give `ikj`. P2, P5 |
| `('ij,kl->ijkl', A, B)` | `A[:, :, None, None] * B` | P2, P5 |
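As a sanity check, a sketch verifying the 2D rows against their NumPy equivalents (random arrays of my own choosing):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 4))
B = rng.random((3, 4))  # same shape as A for the elementwise rows
C = rng.random((4, 5))  # for matrix multiplication

assert np.allclose(np.einsum('ji', A), A.T)                      # transpose (P5)
assert np.isclose(np.einsum('ij->', A), A.sum())                 # total sum (P4)
assert np.allclose(np.einsum('ij->j', A), A.sum(axis=0))         # column sums
assert np.allclose(np.einsum('ij,ij->ij', A, B), A * B)          # elementwise product
assert np.allclose(np.einsum('ij,jk', A, C), A @ C)              # matrix product
assert np.allclose(np.einsum('ij,kj->ik', A, B), np.inner(A, B)) # sum over last axes
assert np.allclose(np.einsum('ij,kj->ikj', A, B), A[:, None] * B)

M = rng.random((4, 4))
assert np.isclose(np.einsum('ii', M), np.trace(M))               # trace (P3 + P4)
assert np.allclose(np.einsum('ii->i', M), np.diag(M))            # diagonal (P3)
```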
### Ellipsis (`...`)

| Subscripts | Meaning |
|---|---|
| `np.einsum('...ii->...i', A)` | Diagonal entries of the last two dimensions |
| `np.einsum('i...i', A)` | Trace along the first and last axes |
| `np.einsum('ij...,jk...->ik...', A, B)` | Matrix multiplication over the left-most indices instead of the right-most |
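A sketch of the ellipsis rows on small batched arrays (shapes and values are my own, for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# '...ii->...i': diagonal of the last two axes, batched over the rest
T = rng.random((5, 3, 3))  # a batch of five 3x3 matrices
diags = np.einsum('...ii->...i', T)
assert diags.shape == (5, 3)
assert np.allclose(diags, np.array([np.diag(m) for m in T]))

# 'i...i': trace between the first and last axes
S = rng.random((3, 4, 3))
assert np.allclose(np.einsum('i...i', S), sum(S[i, :, i] for i in range(3)))

# 'ij...,jk...->ik...': matrix multiplication over the left-most indices
A3 = rng.random((2, 3, 5))
B3 = rng.random((3, 4, 5))
out = np.einsum('ij...,jk...->ik...', A3, B3)  # shape (2, 4, 5)
expected = np.stack([A3[:, :, s] @ B3[:, :, s] for s in range(5)], axis=-1)
assert np.allclose(out, expected)
```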
## References

- Post title: A Minimalistic Introduction to einsum
- Post author: Lutao Dai
- Create time: 2022-05-03 16:02:00
- Post link: https://lutaodai.github.io/2022-05-03-minimalistic-intro-einsum/
- Copyright notice: All articles in this blog are licensed under BY-NC-SA unless otherwise stated.