# A peek into the world of tensors


When I was in high school, two cool things I learned from physics were “vectors” and “calculus”. I was (and still am) awestruck by the following statement:

Uniform circular motion is an accelerated motion.

At college, during my first (and second-last) physics course, I was taught “vector calculus” and I didn’t enjoy it. Last year I learned “linear algebra”, that is, the study of vector spaces, matrices, linear transformations, and so on. Also, a few months ago I wrote about my understanding of Algebra. In it I briefly mentioned that “…study of symmetry of equations, geometric objects, etc. became one of the central topics of interest…”, and this led to what we call “Abstract Algebra”, of which linear algebra is a part. The following video by 3blue1brown explains how our understanding of vectors from physics can be used to develop the subject of linear algebra.

But one should ask: “Why do we care to classify physical quantities as scalars and vectors?” The answer to this question lies in the quest of physics to find invariants, in terms of which we can state the laws of nature. In general, the idea of finding invariants is a useful problem-solving strategy in mathematics (the language of physics). For example, consider the following problem from the book “Problem-Solving Strategies” by Arthur Engel:

Suppose the positive integer n is odd. The numbers 1, 2, …, 2n are written on the blackboard. Then one can pick any two numbers a and b, erase them, and write instead |a-b|. Prove that an odd number will remain at the end.

To prove this statement, one has to use the “parity” of the sum 1+2+3+…+2n as the invariant: replacing a and b by |a-b| never changes the parity of the sum, and $1+2+\cdots+2n = n(2n+1)$ is odd when n is odd. And as stated in the video above, vectors are invariant under transformations of coordinate systems (the components change, but the length and direction of the arrow remain unchanged). For example, consider the rotation of the 2D axes by an angle θ, keeping the origin fixed.

By Guy vandegrift (Own work) [CC BY-SA 3.0], via Wikimedia Commons

$\displaystyle{x' = x \cos \theta + y \sin \theta; \quad y'= -x \sin \theta + y \cos \theta}$

Now, we can rewrite this by using $x_1$ and $x_2$ instead of $x$ and $y$; and putting different subscripts on the single letter $a$ instead of functions of $\theta$:

$\displaystyle{x_1' = a_{11} x_1 + a_{12}x_2; \quad x_2'= a_{21}x_1 + a_{22}x_2}$
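As a quick numerical sketch of this (my own check, using NumPy, which the post does not assume), we can confirm that the components $(x_1, x_2)$ change under the rotation while the length of the arrow stays the same:

```python
import numpy as np

# The coefficients a_ij for a rotation by an angle theta, as defined above.
theta = 0.7  # an arbitrary rotation angle, in radians
a = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

x = np.array([3.0, 4.0])   # components (x_1, x_2) in the old system
x_new = a @ x              # components (x_1', x_2') in the rotated system

# The components change, but the length of the arrow is invariant.
print(x_new)
print(np.isclose(np.linalg.norm(x), np.linalg.norm(x_new)))  # True
```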

Now we differentiate this system of equations to get:

$\displaystyle{dx_1'= \frac{\partial x_1'}{\partial x_1} dx_1 + \frac{\partial x_1'}{\partial x_2} dx_2 ; \quad dx_2' = \frac{\partial x_2'}{\partial x_1} dx_1 + \frac{\partial x_2'}{\partial x_2} dx_2 }$

where $a_{ij} = \frac{\partial x_i'}{\partial x_j}$. We can rewrite this system in condensed form as:

$\displaystyle{dx_{\mu}' = \sum_{\sigma} \frac{\partial x_{\mu}'}{\partial x_{\sigma}} dx_{\sigma}}$

for $\mu =1,2$ and $\sigma =1,2$. We can further abbreviate it by omitting the summation symbol $\sum_{\sigma}$, with the understanding that whenever a subscript occurs twice in a single term, we sum over that subscript (the Einstein summation convention).

$\displaystyle{\boxed{dx_{\mu}' = \frac{\partial x_{\mu}'}{\partial x_{\sigma}} dx_{\sigma} }}$

for $\mu =1,2$ and $\sigma =1,2$. This equation represents ANY transformation of coordinates whenever the values of $(x_{\sigma})$ and $(x_{\mu}')$ are in one-to-one correspondence. Moreover, it can be extended to represent transformation of coordinates of any n-dimensional vector. For example, if  $\mu =1,2,3$ and $\sigma =1,2,3$ then it represents coordinate  transformations of a 3-dimensional vector.

But there are physical quantities which can’t be classified as scalars or vectors. For example, “stress” (the internal force experienced by a material due to the “strain” caused by an external force) is described as a “tensor of rank 2”. This is because the stress at any point on a surface depends on both the external force vector and the area vector, i.e. it describes what happens due to the interaction between two vectors. The Cauchy stress tensor $\boldsymbol{\sigma}$ consists of nine components $\sigma_{ij}$ that completely define the state of stress at a point inside a material in the deformed state (where i corresponds to the force component direction and j corresponds to the area component direction). The tensor relates a unit-length direction vector n to the stress vector $\mathbf{T}^{(\mathbf{n})}$ across an imaginary surface perpendicular to n:

$\displaystyle{\mathbf{T}^{(\mathbf n)}= \mathbf n \cdot\boldsymbol{\sigma}\quad \text{or} \quad T_j^{(n)}= \sigma_{ij}n_i}$        where,  $\boldsymbol{\sigma} = \left[{\begin{matrix} \mathbf{T}^{(\mathbf{e}_1)} \\ \mathbf{T}^{(\mathbf{e}_2)} \\ \mathbf{T}^{(\mathbf{e}_3)} \\ \end{matrix}}\right] = \left[{\begin{matrix} \sigma _{11} & \sigma _{12} & \sigma _{13} \\ \sigma _{21} & \sigma _{22} & \sigma _{23} \\ \sigma _{31} & \sigma _{32} & \sigma _{33} \\ \end{matrix}}\right]$
where $\sigma_{11}$, $\sigma_{22}$ and $\sigma_{33}$ are normal stresses, and $\sigma_{12}$, $\sigma_{13}$, $\sigma_{21}$, $\sigma_{23}$, $\sigma_{31}$ and $\sigma_{32}$ are shear stresses. We can represent the stress vector acting on a plane with normal unit vector n, as:
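To make the relation $T_j^{(n)} = \sigma_{ij} n_i$ concrete, here is a small sketch with made-up stress components (the numbers are purely illustrative, not from the post):

```python
import numpy as np

# An illustrative stress tensor sigma_ij (symmetric, as for Cauchy stress)
sigma = np.array([[10.0, 2.0, 0.0],
                  [ 2.0, 5.0, 1.0],
                  [ 0.0, 1.0, 3.0]])
n = np.array([1.0, 0.0, 0.0])   # unit normal to the plane x_1 = const

# Traction vector across the plane: T_j = sigma_ij n_i (sum over i)
T = np.einsum('ij,i->j', sigma, n)
print(T)  # picks out the first row of sigma: [10.  2.  0.]
```

With n along $\mathbf{e}_1$, the traction is just the first row of $\boldsymbol{\sigma}$, matching the matrix of row vectors $\mathbf{T}^{(\mathbf{e}_i)}$ written above.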

By Sanpaz (Own work) [CC BY-SA 3.0 or GFDL], via Wikimedia Commons

Here, the tetrahedron is formed by slicing a parallelepiped along an arbitrary plane n. So, the force acting on the plane n is the reaction exerted by the other half of the parallelepiped and has an opposite sign.

In this terminology, a scalar is a tensor of rank zero and a vector is a tensor of rank one. Moreover, in an n-dimensional space:

• a vector has $n$ components
• a tensor of rank two has $n^2$ components
• a tensor of rank three has $n^3$ components
• and so on …
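The counting rule above fits in one line; a tiny sanity check (my own, assuming nothing beyond the rule $n^r$):

```python
# A tensor of rank r in an n-dimensional space has n**r components.
def num_components(n, rank):
    return n ** rank

assert num_components(3, 0) == 1   # scalar
assert num_components(3, 1) == 3   # vector
assert num_components(3, 2) == 9   # e.g. the stress tensor above
```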

Just like vectors, tensors in general are invariant under transformations of coordinate systems. We wish to exploit this further. Let’s reconsider the boxed equation stated earlier. Since we are working with the Euclidean metric, i.e. the length $s$ of a vector is given by $s^2=x_1^2+x_2^2$, we have $ds^2=dx_1^2+dx_2^2$, i.e. $dx_1$ and $dx_2$ are the components of $ds$. So, replacing $dx_1$ and $dx_2$ by $A^1$ and $A^2$, we get (the motivation is to capture the idea of the area vector):

$\displaystyle{\boxed{A^{' \mu} = \frac{\partial x_{\mu}'}{\partial x_{\sigma}} A^{\sigma}}}$

where $A^1, A^2, A^3, \ldots$ are components of a vector in a certain coordinate system (note that the superscripts are just indices and do NOT represent exponents). Any set of quantities which transforms according to this equation is defined to be a contravariant vector. Moreover, we can generalize this equation to a tensor of any rank. For example, a contravariant tensor of rank two is defined by:

$\displaystyle{A^{' \alpha \beta} = \frac{\partial x_{\alpha}'}{\partial x_{\gamma}} \frac{\partial x_{\beta}'}{\partial x_{\delta}} A^{\gamma \delta}}$

where the sum is over the indices $\gamma$ and $\delta$ (since they occur twice in the term on right). We can illustrate this for 3 dimensional space, i.e. $\alpha , \beta , \gamma , \delta = 1,2,3$ but summation performed only on  $\gamma$  and  $\delta$; for instance, if  $\alpha=1$ and  $\beta=2$ then we have:

$\displaystyle{A^{' 12} = \frac{\partial x_{1}'}{\partial x_{1}} \frac{\partial x_{2}'}{\partial x_{1}} A^{11} + \frac{\partial x_{1}'}{\partial x_{1}} \frac{\partial x_{2}'}{\partial x_{2}} A^{12} + \frac{\partial x_{1}'}{\partial x_{1}} \frac{\partial x_{2}'}{\partial x_{3}} A^{13}+ \frac{\partial x_{1}'}{\partial x_{2}} \frac{\partial x_{2}'}{\partial x_{1}} A^{21} + \frac{\partial x_{1}'}{\partial x_{2}} \frac{\partial x_{2}'}{\partial x_{2}} A^{22} + \frac{\partial x_{1}'}{\partial x_{2}} \frac{\partial x_{2}'}{\partial x_{3}} A^{23} }$

$\displaystyle{+\frac{\partial x_{1}'}{\partial x_{3}} \frac{\partial x_{2}'}{\partial x_{1}} A^{31} + \frac{\partial x_{1}'}{\partial x_{3}} \frac{\partial x_{2}'}{\partial x_{2}} A^{32} + \frac{\partial x_{1}'}{\partial x_{3}} \frac{\partial x_{2}'}{\partial x_{3}} A^{33} }$
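Rather than multiplying out such expansions by hand, we can let a computer check the rank-2 law. This is my own sketch (not from the post): for a linear change of coordinates the Jacobian entries $\partial x_\mu'/\partial x_\sigma$ are constant, so we can store them in a matrix and compare the condensed law against the nine-term expansion of $A^{'12}$ written above.

```python
import numpy as np

# Jacobian J[mu, sigma] = d x'_mu / d x_sigma for a rotation about the x_3 axis
theta = 0.3
J = np.array([[np.cos(theta),  np.sin(theta), 0.0],
              [-np.sin(theta), np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

A = np.arange(9.0).reshape(3, 3)   # arbitrary components A^{gamma delta}

# Condensed law: A'^{ab} = J[a,g] J[b,d] A^{gd}, summed over g and d
A_new = np.einsum('ag,bd,gd->ab', J, J, A)

# The nine-term expansion of A'^{12}, spelled out explicitly
a12 = sum(J[0, g] * J[1, d] * A[g, d] for g in range(3) for d in range(3))
print(np.isclose(A_new[0, 1], a12))  # True
```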

So, we just analysed the invariance of one of the flavours of tensors. Thinking mathematically, one should expect the existence of something like an “algebraic inverse” of the contravariant tensor, because a tensor is a generalization of a vector, and in linear algebra we study inverse operations. Let’s consider a situation where we want to analyse the density of an object at different points. For simplicity, let’s consider a point $A(x_1,x_2)$ on a plane surface with variable density.

A surface whose density is different in different parts

If we designate by $\psi$ the density at A, then $\frac{\partial \psi}{\partial x_1}$ and $\frac{\partial \psi}{\partial x_2}$ represent, respectively, the partial variation of $\psi$ in the $x_1$ and $x_2$ directions. Although $\psi$ is a scalar quantity, the “change in $\psi$” is a directed quantity with components $\frac{\partial \psi}{\partial x_1}$ and $\frac{\partial \psi}{\partial x_2}$. Note that the “change in $\psi$” is a tensor of rank one because it depends upon the various directions. But it is a tensor in a sense different from what we saw in the case of “stress”. This “difference” will become clear once we analyse what happens to this quantity when the coordinate system is changed.

Now our motive is to express $\frac{\partial \psi}{\partial x_1'}$ and $\frac{\partial \psi}{\partial x_2'}$ in terms of $\frac{\partial \psi}{\partial x_1}$ and $\frac{\partial \psi}{\partial x_2}$. Note that a change in $x_1'$ will affect both $x_1$ and $x_2$ (as seen in the rotation of the 2D axes in the case of a vector). Hence, the resulting changes in $x_1$ and $x_2$ will affect $\psi$:

$\displaystyle{\frac{\partial \psi}{\partial x_1'} = \frac{\partial \psi}{\partial x_1} \frac{\partial x_1}{\partial x_1'} + \frac{\partial \psi}{\partial x_2} \frac{\partial x_2}{\partial x_1'}; \quad \frac{\partial \psi}{\partial x_2'} = \frac{\partial \psi}{\partial x_1} \frac{\partial x_1}{\partial x_2'} + \frac{\partial \psi}{\partial x_2} \frac{\partial x_2}{\partial x_2'}}$

Here we have used the chain rule: if x, y, z are three variables such that z depends on y and y depends on x, and calculating the change in z per unit change in x directly is not easy, then we can compute it using $\frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx}$. We can rewrite this system in condensed form as:

$\displaystyle{\frac{\partial \psi}{\partial x_{\mu}'} = \sum_{\sigma} \frac{\partial \psi}{\partial x_{\sigma}} \frac{\partial x_{\sigma}}{\partial x_{\mu}'} }$

for $\mu =1,2$ and $\sigma =1,2$. As before, we omit the summation symbol $\sum_{\sigma}$ and sum over the repeated subscript:

$\displaystyle{\frac{\partial \psi}{\partial x_{\mu}'} = \frac{\partial \psi}{\partial x_{\sigma}} \frac{\partial x_{\sigma}}{\partial x_{\mu}'} }$

for $\mu =1,2$ and $\sigma =1,2$. Finally, replacing $\frac{\partial \psi}{\partial x_{\mu}'}$ by $A_{\mu}'$ and $\frac{\partial \psi}{\partial x_{\sigma}}$ by $A_{\sigma}$ (to match the notation introduced in the case of the stress tensor), we get:

$\displaystyle{\boxed{A_{ \mu}' = \frac{\partial x_{\sigma}}{\partial x_{\mu}'} A_{\sigma}}}$

where $A_1, A_2, \ldots$ are components of a vector in a certain coordinate system. Any set of quantities which transforms according to this equation is defined to be a covariant vector. Moreover, we can generalize this equation to a tensor of any rank. For example, a covariant tensor of rank two is defined by:

$\displaystyle{A_{ \alpha \beta}' = \frac{\partial x_{\gamma}}{\partial x_{\alpha}'} \frac{\partial x_{\delta}}{\partial x_{\beta}'} A_{\gamma \delta}}$

where the sum is over the indices $\gamma$ and $\delta$ (since they occur twice in the term on right).
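The gradient example that motivated this law can be checked numerically. In this sketch of mine (the field $\psi$ and the angle are made up for illustration), we compute the gradient in both coordinate systems by finite differences and verify that they are related by the inverse Jacobian $\partial x_\sigma / \partial x_\mu'$:

```python
import numpy as np

theta = 0.5
J = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])   # x' = J x
J_inv = np.linalg.inv(J)                           # J_inv[sigma, mu] = d x_sigma / d x'_mu

def psi(x):                        # a sample density field
    return x[0]**2 * x[1]

def grad(f, x, h=1e-6):            # central finite differences
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

p = np.array([1.0, 2.0])
A = grad(psi, p)                                   # components d psi / d x_sigma

psi_new = lambda xp: psi(J_inv @ xp)               # the same field, in new coordinates
A_new = grad(psi_new, J @ p)                       # components d psi / d x'_mu

# Covariant law: A'_mu = (d x_sigma / d x'_mu) A_sigma
print(np.allclose(A_new, np.einsum('sm,s->m', J_inv, A)))  # True
```

Note that the inverse Jacobian appears here, in contrast to the contravariant law above; this is exactly the “difference” flagged earlier.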

Comparing the (boxed) equations describing contravariant and covariant vectors, we observe that the coefficient matrices on the right are inverses of each other (as promised…). Moreover, all these boxed equations represent the law of transformation for tensors of rank one (a.k.a. vectors), which can be generalized to a tensor of any rank.

Our final task is to see how these two flavours of tensors interact with each other. Let’s study the algebraic operations of addition and multiplication for both flavours of tensors, just the way we did for vectors (here by vector product we mean the dot product, since the cross product does not generalize to n-dimensional vectors).

First consider the case of contravariant tensors. Let $A^{\alpha}$ be a vector having two components $A^{1}$ and $A^{2}$ in a plane and $B^{\alpha}$ be another such vector. If we define  $A^{\alpha}+B^\alpha = C^\alpha$ and $A^{\alpha} B^\beta = C^{\alpha \beta}$ (this allows 4 components, namely $C^{11}, C^{12}, C^{21}, C^{22}$)  with

$\displaystyle{A^{' \lambda} = \frac{\partial x_{\lambda}'}{\partial x_{\alpha}} A^{\alpha}; \quad B^{' \mu} = \frac{\partial x_{\mu}'}{\partial x_{\beta}} B^{\beta}}$

for $\lambda, \mu, \alpha, \beta =1,2$, then on their addition and multiplication (called outer multiplication) we get:

$\displaystyle{C^{' \lambda} = \frac{\partial x_{\lambda}'}{\partial x_{\alpha}} C^{\alpha}; \quad C^{' \lambda \mu} = \frac{\partial x_{\lambda}'}{\partial x_{\alpha}} \frac{\partial x_{\mu}'}{\partial x_{\beta}} C^{\alpha \beta}}$

for $\lambda, \mu, \alpha, \beta =1,2$. One can prove this by patiently multiplying out each term and then rearranging. In general, if two contravariant tensors of rank m and n, respectively, are multiplied together, the result is a contravariant tensor of rank m+n.
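Instead of multiplying out each term by hand, we can verify the claim numerically. In this sketch (my own, with made-up components), transforming two contravariant vectors and then taking their outer product agrees with transforming their outer product directly by the rank-2 law:

```python
import numpy as np

theta = 0.4
J = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])   # J[lam, alpha] = d x'_lam / d x_alpha

A = np.array([1.0, 2.0])
B = np.array([3.0, -1.0])
C = np.outer(A, B)             # C^{alpha beta} = A^alpha B^beta

# Transform the vectors first, then take the outer product ...
C_from_vectors = np.outer(J @ A, J @ B)
# ... or transform C directly with the rank-2 contravariant law:
C_transformed = np.einsum('la,mb,ab->lm', J, J, C)

print(np.allclose(C_from_vectors, C_transformed))  # True
```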

For the case of covariant tensors, the addition and (outer) multiplication is done in same manner as above. Let $A_{\alpha}$ be a vector having two components $A_{1}$ and $A_{2}$ in a plane and $B_{\alpha}$ be another such vector. If we define  $A_{\alpha}+B_\alpha = C_\alpha$ and $A_{\alpha} B_\beta = C_{\alpha \beta}$ (this allows 4 components, namely $C_{11}, C_{12}, C_{21}, C_{22}$)  with

$\displaystyle{A_{ \lambda}' = \frac{\partial x_{\alpha}}{\partial x_{\lambda}'} A_{\alpha}; \quad B_{ \mu}' = \frac{\partial x_{\beta}}{\partial x_{\mu}'} B_{\beta}}$

for $\lambda, \mu, \alpha, \beta =1,2$, then on their addition and multiplication (called outer multiplication) we get:

$\displaystyle{C_{ \lambda} '= \frac{\partial x_{\alpha}}{\partial x_{\lambda}'} C_{\alpha}; \quad C_{ \lambda \mu}'= \frac{\partial x_{\alpha}} {\partial x_{\lambda}'}\frac{\partial x_{\beta}}{\partial x_{\mu}'} C_{\alpha \beta}}$

for $\lambda, \mu, \alpha, \beta =1,2$. In general, if two covariant tensors of rank m and n respectively, are multiplied together, the result is a covariant tensor of rank m+n.

Now, as promised, it’s time to see how both of these flavours of tensors interact with each other. Let’s extend the notion of outer multiplication, defined for each flavour of tensor, to the outer product of a contravariant tensor with a covariant tensor. For example, consider vectors (a.k.a. tensors of rank 1) of each type:

$\displaystyle{A^{' \lambda} = \frac{\partial x_{\lambda}'}{\partial x_{\alpha}} A^{\alpha}; \quad B_{ \mu}' = \frac{\partial x_{\beta}}{\partial x_{\mu}'} B_{\beta}}$

then their outer product leads to

$\displaystyle{C^{' \lambda}_{ \mu}= \frac{\partial x_{\lambda}'}{\partial x_{\alpha}} \frac{\partial x_{\beta}}{\partial x_{\mu}'} C^{\alpha}_{\beta}}$

where $A^\alpha B_\beta = C^\alpha _\beta$. This is neither a contravariant nor a covariant tensor, and hence is called a mixed tensor of rank 2. More generally, if a contravariant tensor of rank m and a covariant tensor of rank n are multiplied together so as to form their outer product, the result is a mixed tensor of rank m+n.
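The mixed law can be checked the same way as before. In this sketch of mine (components made up), one index transforms with the Jacobian and the other with its inverse:

```python
import numpy as np

theta = 0.9
J = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])   # J[lam, alpha] = d x'_lam / d x_alpha
J_inv = np.linalg.inv(J)                           # J_inv[beta, mu] = d x_beta / d x'_mu

A = np.array([2.0, 5.0])       # contravariant components A^alpha
B = np.array([-1.0, 4.0])      # covariant components B_beta
C = np.outer(A, B)             # C^alpha_beta = A^alpha B_beta

# Transform the vectors first (each by its own law), then take the outer product ...
C_from_vectors = np.outer(J @ A, np.einsum('bm,b->m', J_inv, B))
# ... or transform C directly: C'^lam_mu = J[lam,a] J_inv[b,mu] C^a_b
C_mixed_law = np.einsum('la,bm,ab->lm', J, J_inv, C)

print(np.allclose(C_from_vectors, C_mixed_law))  # True
```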

In general, if two mixed tensors of rank m (having $m_1$ indices/superscripts of contravariance and $m_2$ indices/subscripts of covariance, such that $m_1+m_2=m$) and n (having $n_1$ indices/superscripts of contravariance and $n_2$ indices/subscripts of covariance, such that $n_1+n_2=n$)  respectively, are multiplied together, the result is a mixed tensor of rank m+n (having $m_1+n_1$ indices/superscripts of contravariance and $m_2+n_2$ indices/subscripts of covariance, such that $m_1+n_1+m_2+n_2=m+n$) .

Unlike the previous two types of tensors, we can’t illustrate it using a simple physical example. To convince yourself, consider the following two mixed tensors of rank 3 and rank 2, respectively:

$\displaystyle{A^{'\alpha \beta}_{\gamma} = \frac{\partial x_{\nu} }{\partial x_{\gamma}'}\frac{\partial x_{\alpha}'}{\partial x_{\lambda}}\frac{\partial x_{\beta}'}{\partial x_{\mu}} A^{\lambda \mu}_{\nu}; \quad B^{'\kappa}_{\delta} = \frac{\partial x_{\sigma}}{\partial x_{\delta}'}\frac{\partial x_{\kappa}'}{\partial x_{\rho}}B^{\rho}_{\sigma}}$

then, following the notation introduced above, their outer product is of rank 5 and is given by

$\displaystyle{C^{'\alpha\beta\kappa}_{\gamma\delta} = \frac{\partial x_{\nu} }{\partial x_{\gamma}'} \frac{\partial x_{\sigma}}{\partial x_{\delta}'}\frac{\partial x_{\alpha}'}{\partial x_{\lambda}}\frac{\partial x_{\beta}'}{\partial x_{\mu}}\frac{\partial x_{\kappa}'}{\partial x_{\rho}} C^{\lambda\mu\rho}_{\nu\sigma}}$

Behind this notation, the processes are really complicated. Suppose that we are working in a 3D vector space. Then the transformation law for tensor $\mathbf{A}$ represents a set of 27 ($=3^3$) equations, each with 27 terms on the right. The transformation law for tensor $\mathbf{B}$ represents a set of 9 ($=3^2$) equations, each with 9 terms on the right. Therefore, the transformation law of their outer product $\mathbf{C}$ represents a set of 243 ($=3^5$) equations, each with 243 terms on the right.

So, unlike the previous two cases of contravariant and covariant tensors, the proof for the outer product of mixed tensors is rather complicated and out of scope for this introductory post.

Reference:

[L] Lillian R. Lieber, The Einstein Theory of Relativity. Internet Archive: https://archive.org/details/einsteintheoryof032414mbp