# A Minimalistic Introduction to einsum

This post introduces the essence of the Einstein summation convention (einsum).
## Introduction

`einsum` streamlines the following algebraic operations:

1. matrix multiplication
2. summation over given dimensions
3. transposition

The NumPy function for the Einstein summation convention is

```python
np.einsum(subscripts, *operands)
```

where `subscripts` is a string defining the operation, and `operands` are the arrays to operate on.
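As a quick illustration before the principles, here is a minimal sketch (array values are my own, chosen for demonstration) showing matrix multiplication written as an einsum:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)   # shape (2, 3)
B = np.arange(12).reshape(3, 4)  # shape (3, 4)

# 'ij,jk->ik': multiply over the shared label j, then sum it away
C = np.einsum('ij,jk->ik', A, B)
assert np.array_equal(C, A @ B)
```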
## Principles

The Einstein summation convention follows the principles below. In short: the inputs (left of the `->`) control multiplication; the output (right of the `->`) controls summation and transposition.

Input arrays:

- (P1) Letters repeated across input arrays mark dimensions to be multiplied together.
- (P2) A label that appears only once is not summed.
- (P3) A label repeated within a single operand takes the diagonal.

The output array:

- (P4, summation) Letters omitted from the output are summed over; dimensions corresponding to non-omitted letters are kept.
- (P5, order) In implicit mode (without `->`), the choice of subscripts matters because the output axes are reordered alphabetically. In explicit mode (with `->`), the axis order is defined by the output subscripts.
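To make the principles concrete, here is a small runnable sketch (array values are my own, chosen for illustration) exercising each one:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])

# P1: 'i' repeated across inputs -> elementwise multiplication over that axis
assert np.allclose(np.einsum('i,i->i', v, w), v * w)

# P3: 'ii' repeated within one operand -> take the diagonal
assert np.allclose(np.einsum('ii->i', A), np.diag(A))

# P4: a letter omitted from the output is summed over
assert np.allclose(np.einsum('ij->j', A), A.sum(axis=0))

# P5: implicit mode reorders output axes alphabetically, so 'ji' transposes
assert np.allclose(np.einsum('ji', A), A.T)
```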
## Examples

### `A`, `B`: 1D arrays

| Subscripts | NumPy | Principles |
|---|---|---|
| `('i->', A)` | `np.sum(A)` | P4 |
| `('i,i->i', A, B)` | `A * B` | P1: repeated `i` on the left → multiplication over that dimension. P4: nothing omitted → the dimension is kept. |
| `('i,i', A, B)` | `np.inner(A, B)` | P1: repeated `i` on the left → multiplication over that dimension. P4: `i` omitted → summation collapses the dimension. |
| `('i,j->ij', A, B)` | `np.outer(A, B)` | P1: no repeated letters on the left → the multiplication pattern is derived from the output: every element of `A` multiplies every element of `B`. |
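The 1D rows above can be checked directly with a short sketch (array values are my own):

```python
import numpy as np

A = np.array([1.0, 2.0, 3.0])
B = np.array([4.0, 5.0, 6.0])

assert np.einsum('i->', A) == A.sum()                     # total sum (P4)
assert np.allclose(np.einsum('i,i->i', A, B), A * B)      # elementwise product
assert np.isclose(np.einsum('i,i', A, B), np.inner(A, B)) # inner product
assert np.allclose(np.einsum('i,j->ij', A, B), np.outer(A, B))  # outer product
```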
### `A`, `B`: 2D arrays

| Subscripts | NumPy | Principles |
|---|---|---|
| `('ij', A)` | `A` | P2, P5 |
| `('ji', A)` | `A.T` | P2, P5: the subscript names the first and second axes of `A` as `j` and `i`. In implicit mode, the output axes are rearranged alphabetically, so the second axis of `A` becomes the first axis of the output. |
| `('ij->', A)` | `np.sum(A)` | P4 |
| `('ij->j', A)` | `np.sum(A, axis=0)` | P4 |
| `('ij->i', A)` | `np.sum(A, axis=1)` | P4 |
| `('ii', A)` or `('ii->', A)` | `np.trace(A)` | P3: `ii` takes the diagonal (an array of size `i`); P4: the only named label `i` is missing from the output → summation. |
| `('ii->i', A)` | `np.diag(A)` | P3: `ii` takes the diagonal; P4: the dimension `i` is kept. |
| `('ij,ij->ij', A, B)` | `A * B` | P1, P5 |
| `('ij,ji->ij', A, B)` | `A * B.T` | P1: the first axis of `A` is multiplied with the second axis of `B`, and the second axis of `A` with the first axis of `B`. P5: the output is arranged the same way as `A`. |
| `('ij,jk', A, B)` | `np.dot(A, B)` | P1, P5 |
| `('ij,kj->ik', A, B)` | `np.inner(A, B)` (in higher dimensions, `inner` executes a sum product over the last axes) | P1, P4, P5 |
| `('ij,kj->ikj', A, B)` | `A[:, None] * B` | `A[:, None]` is equivalent to `i1j` (inserting a singleton dimension in the second position), which broadcasts against `kj` to give `ikj`. P2, P5 |
| `('ij,kl->ijkl', A, B)` | `A[:, :, None, None] * B` | P2, P5 |
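As a sanity check, a sketch verifying the 2D rows against their NumPy equivalents (random arrays of my own choosing):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 4))
B = rng.random((3, 4))  # same shape as A for the elementwise rows
C = rng.random((4, 5))  # for matrix multiplication

assert np.allclose(np.einsum('ji', A), A.T)                      # transpose (P5)
assert np.isclose(np.einsum('ij->', A), A.sum())                 # total sum (P4)
assert np.allclose(np.einsum('ij->j', A), A.sum(axis=0))         # column sums
assert np.allclose(np.einsum('ij,ij->ij', A, B), A * B)          # elementwise product
assert np.allclose(np.einsum('ij,jk', A, C), A @ C)              # matrix product
assert np.allclose(np.einsum('ij,kj->ik', A, B), np.inner(A, B)) # sum over last axes
assert np.allclose(np.einsum('ij,kj->ikj', A, B), A[:, None] * B)

M = rng.random((4, 4))
assert np.isclose(np.einsum('ii', M), np.trace(M))               # trace (P3 + P4)
assert np.allclose(np.einsum('ii->i', M), np.diag(M))            # diagonal (P3)
```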
### Ellipsis (`...`)

| Subscripts | Meaning |
|---|---|
| `np.einsum('...ii->...i', A)` | Diagonal entries of the last two dimensions |
| `np.einsum('i...i', A)` | Trace along the first and last axes |
| `np.einsum('ij...,jk...->ik...', A, B)` | Matrix multiplication over the left-most indices instead of the right-most |
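A sketch of the ellipsis rows on small batched arrays (shapes and values are my own, for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# '...ii->...i': diagonal of the last two axes, batched over the rest
T = rng.random((5, 3, 3))  # a batch of five 3x3 matrices
diags = np.einsum('...ii->...i', T)
assert diags.shape == (5, 3)
assert np.allclose(diags, np.array([np.diag(m) for m in T]))

# 'i...i': trace between the first and last axes
S = rng.random((3, 4, 3))
assert np.allclose(np.einsum('i...i', S), sum(S[i, :, i] for i in range(3)))

# 'ij...,jk...->ik...': matrix multiplication over the left-most indices
A3 = rng.random((2, 3, 5))
B3 = rng.random((3, 4, 5))
out = np.einsum('ij...,jk...->ik...', A3, B3)  # shape (2, 4, 5)
expected = np.stack([A3[:, :, s] @ B3[:, :, s] for s in range(5)], axis=-1)
assert np.allclose(out, expected)
```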
## References

- Post title: A Minimalistic Introduction to einsum
- Post author: Lutao Dai
- Create time: 2022-05-03 16:02:00
- Post link: https://lutaodai.github.io/2022-05-03-minimalistic-intro-einsum/
- Copyright notice: All articles in this blog are licensed under BY-NC-SA unless otherwise stated.