A peek into the world of tensors


When I was in high school, two cool things that I learned from physics were “vectors” and “calculus”. I was (and still am) awestruck by the following statement:

Uniform circular motion is an accelerated motion.

At college, during my first (and second-to-last) physics course, I was taught “vector calculus” and I didn’t enjoy it. Last year I learned “linear algebra”, that is, the study of vector spaces, matrices, linear transformations… Also, a few months ago I wrote about my understanding of Algebra. In it I briefly mentioned that “…study of symmetry of equations, geometric objects, etc. became one of the central topics of interest…” and this led to what we call “Abstract Algebra”, of which linear algebra is a part. The following video by 3blue1brown explains how our understanding of vectors from physics can be used to develop the subject of linear algebra.

But one should ask: “Why do we care to classify physical quantities as scalars and vectors?” The answer to this question lies in the quest of physics to find invariants, in terms of which we can state the laws of nature. In general, the idea of finding invariants is a useful problem-solving strategy in mathematics (the language of physics). For example, consider the following problem from the book “Problem-Solving Strategies” by Arthur Engel:

Suppose the positive integer n is  odd. The numbers 1,2,…, 2n are written on the blackboard. Then one can pick any two numbers a and b, erase them and write instead |a-b|. Prove that an odd number will remain at the end.

To prove this statement, one has to use the parity of the sum 1+2+3+…+2n as the invariant: replacing a and b by |a-b| never changes the parity of the sum of the numbers on the board (since a+b and |a-b| have the same parity), and the initial sum n(2n+1) is odd when n is odd, so the last remaining number must be odd. And as stated in the video above, vectors are invariant under transformations of coordinate systems (the components change, but the length and direction of the arrow remain unchanged). For example, consider the rotation of the 2D axes by an angle θ, keeping the origin fixed.

[Figure: rotation of the coordinate axes by an angle θ. By Guy vandegrift (Own work) [CC BY-SA 3.0], via Wikimedia Commons]

\displaystyle{x' =  x \cos \theta + y \sin \theta; \quad y'= -x \sin \theta + y \cos \theta}
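
As a quick sanity check of this invariance, here is a minimal NumPy sketch (the particular arrow and the angle are arbitrary choices of mine, not anything from the derivation itself): the components change under the rotation, but the length of the arrow does not.

```python
import numpy as np

theta = 0.3          # an arbitrary rotation angle (radians)
x, y = 2.0, 5.0      # components of a fixed arrow in the original axes

# Components of the SAME arrow in the rotated axes
x_new = x * np.cos(theta) + y * np.sin(theta)
y_new = -x * np.sin(theta) + y * np.cos(theta)

# The components differ, but the length (an invariant) is unchanged
print(x_new, y_new)
print(np.hypot(x, y), np.hypot(x_new, y_new))   # both ~5.3852
```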

Now, we can rewrite the rotation equations by using x_1 and x_2 instead of x and y, and by putting different subscripts on the single letter a instead of the functions of \theta:

\displaystyle{x_1' =  a_{11} x_1 + a_{12}x_2; \quad x_2'=  a_{21}x_1 + a_{22}x_2}

Now we differentiate this system of equations to get:

\displaystyle{dx_1'= \frac{\partial x_1'}{\partial x_1} dx_1 + \frac{\partial x_1'}{\partial x_2} dx_2 ; \quad dx_2' = \frac{\partial x_2'}{\partial x_1} dx_1  + \frac{\partial x_2'}{\partial x_2} dx_2 }

where a_{ij} = \frac{\partial x_i'}{\partial x_j} . We can rewrite this system in condensed form as:

\displaystyle{dx_{\mu}' = \sum_{\sigma} \frac{\partial x_{\mu}'}{\partial x_{\sigma}} dx_{\sigma}}

for \mu =1,2 and \sigma =1,2. We can further abbreviate it by omitting the summation symbol \sum_{\sigma}, with the understanding that whenever an index occurs twice in a single term, we sum over that index (the Einstein summation convention).

\displaystyle{\boxed{dx_{\mu}' = \frac{\partial x_{\mu}'}{\partial x_{\sigma}} dx_{\sigma} }}

for \mu =1,2 and \sigma =1,2. This equation represents ANY transformation of coordinates whenever the values of (x_{\sigma}) and (x_{\mu}') are in one-to-one correspondence. Moreover, it can be extended to represent the transformation of coordinates of any n-dimensional vector. For example, if \mu =1,2,3 and \sigma =1,2,3, then it represents coordinate transformations of a 3-dimensional vector.
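In code, this boxed law is simply the Jacobian matrix acting on the components, and NumPy’s einsum mirrors the summation convention directly. Below is a small illustrative sketch; the particular transformation (the plane rotation from above) and all variable names are my own choices.

```python
import numpy as np

theta = 0.3
# Jacobian of the primed coordinates with respect to the unprimed ones:
# J[mu, sigma] = d x'_mu / d x_sigma, here for the plane rotation above
J = np.array([[ np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

dx = np.array([0.1, -0.2])            # components dx_sigma

# dx'_mu = (d x'_mu / d x_sigma) dx_sigma : the repeated index sigma is summed
dx_new = np.einsum('ms,s->m', J, dx)

print(dx_new)
print(np.allclose(dx_new, J @ dx))    # True: same thing as a matrix product
```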


But there are physical quantities which can’t be classified as scalars or vectors. For example, “stress”, the internal force experienced by a material due to the “strain” caused by an external force, is described as a “tensor of rank 2”. This is so because the stress at any point on a surface depends upon both the external force vector and the area vector, i.e. it describes what happens due to the interaction between two vectors. The Cauchy stress tensor \boldsymbol{\sigma} consists of nine components \sigma_{ij} that completely define the state of stress at a point inside a material in the deformed state (where i corresponds to the direction of the surface normal, i.e. the area component, and j corresponds to the direction of the force component). The tensor relates a unit-length direction vector n to the stress vector \mathbf{T}^{(\mathbf{n})} across an imaginary surface perpendicular to n:

\displaystyle{\mathbf{T}^{(\mathbf n)}= \mathbf n \cdot\boldsymbol{\sigma}\quad \text{or} \quad T_j^{(n)}= \sigma_{ij}n_i}        where,  \boldsymbol{\sigma} = \left[{\begin{matrix} \mathbf{T}^{(\mathbf{e}_1)} \\  \mathbf{T}^{(\mathbf{e}_2)} \\  \mathbf{T}^{(\mathbf{e}_3)} \\  \end{matrix}}\right] =  \left[{\begin{matrix}  \sigma _{11} & \sigma _{12} & \sigma _{13} \\  \sigma _{21} & \sigma _{22} & \sigma _{23} \\  \sigma _{31} & \sigma _{32} & \sigma _{33} \\  \end{matrix}}\right]
where \sigma_{11}, \sigma_{22} and \sigma_{33} are normal stresses, and \sigma_{12}, \sigma_{13}, \sigma_{21}, \sigma_{23}, \sigma_{31} and \sigma_{32} are shear stresses. We can represent the stress vector acting on a plane with unit normal vector n as follows:

[Figure: the Cauchy tetrahedron, showing the stress vector on a plane with unit normal n. By Sanpaz (Own work) [CC BY-SA 3.0 or GFDL], via Wikimedia Commons]

Here, the tetrahedron is formed by slicing a parallelepiped along an arbitrary plane with unit normal n. So, the force acting on the plane n is the reaction exerted by the other half of the parallelepiped, and it has the opposite sign.
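As a small numerical illustration of the relation T_j^{(n)} = \sigma_{ij} n_i (the stress values below are made up purely for the example):

```python
import numpy as np

# A (symmetric) stress tensor sigma[i, j]: row i is the traction on the plane
# normal to e_i, and column j is the direction of the force component.
sigma = np.array([[10.0,  2.0,  0.0],
                  [ 2.0,  5.0, -1.0],
                  [ 0.0, -1.0,  3.0]])       # made-up values

n = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)   # unit normal of the cutting plane

# T_j = sigma_{ij} n_i  (sum over the repeated index i)
T = np.einsum('ij,i->j', sigma, n)

print(T)            # the stress (traction) vector acting on the plane
print(n @ sigma)    # the same computation written as n . sigma
```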

In this terminology, a scalar is a tensor of rank zero and a vector is a tensor of rank one. Moreover, in an n-dimensional space:

  • a vector has n components
  • a tensor of rank two has n^2 components
  • a tensor of rank three has n^3 components
  • and so on …
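Computationally, the rank is just the number of indices needed to address a component, so a tensor of rank r in n-dimensional space has n^r components. A minimal NumPy sketch of this bookkeeping (the arrays here are only placeholders):

```python
import numpy as np

n = 3                                  # dimension of the space
scalar = np.float64(7.0)               # rank 0: a single component, no indices
vector = np.zeros(n)                   # rank 1: n components
rank2 = np.zeros((n, n))               # rank 2: n**2 components
rank3 = np.zeros((n, n, n))            # rank 3: n**3 components

for t in (scalar, vector, rank2, rank3):
    print(t.ndim, t.size)              # rank (number of indices) and number of components
```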

Just like vectors, tensors in general are invariant under transformations of coordinate systems. We wish to exploit this further. Let’s reconsider the boxed equation stated earlier. Since we are working with the Euclidean metric, i.e. the length s of a vector is given by s^2=x_1^2+x_2^2, we have ds^2=dx_1^2+dx_2^2, i.e. dx_1 and dx_2 are the components of ds. So, replacing dx_1 and dx_2 by A^1 and A^2, we get (the motivation is to capture the idea of an area vector):

\displaystyle{\boxed{A^{' \mu} = \frac{\partial x_{\mu}'}{\partial x_{\sigma}} A^{\sigma}}}

where A^1, A^2, A^3, \ldots are components of a vector in a certain coordinate system (note that the superscripts are just for indexing purposes and do NOT represent exponents). Any set of quantities which transforms according to this equation is defined to be a contravariant vector. Moreover, we can generalize this equation to a tensor of any rank. For example, a contravariant tensor of rank two is defined by:

\displaystyle{A^{' \alpha \beta} = \frac{\partial x_{\alpha}'}{\partial x_{\gamma}} \frac{\partial x_{\beta}'}{\partial x_{\delta}} A^{\gamma \delta}}

where the sum is over the indices \gamma and \delta (since they occur twice in the term on the right). We can illustrate this for 3-dimensional space, i.e. \alpha, \beta, \gamma, \delta = 1,2,3, with summation performed only over \gamma and \delta; for instance, if \alpha=1 and \beta=2, then we have:

\displaystyle{A^{' 12} = \frac{\partial x_{1}'}{\partial x_{1}} \frac{\partial x_{2}'}{\partial x_{1}} A^{11} + \frac{\partial x_{1}'}{\partial x_{1}} \frac{\partial x_{2}'}{\partial x_{2}} A^{12} + \frac{\partial x_{1}'}{\partial x_{1}} \frac{\partial x_{2}'}{\partial x_{3}} A^{13}+ \frac{\partial x_{1}'}{\partial x_{2}} \frac{\partial x_{2}'}{\partial x_{1}} A^{21}  + \frac{\partial x_{1}'}{\partial x_{2}} \frac{\partial x_{2}'}{\partial x_{2}} A^{22} + \frac{\partial x_{1}'}{\partial x_{2}} \frac{\partial x_{2}'}{\partial x_{3}} A^{23} }

\displaystyle{+\frac{\partial x_{1}'}{\partial x_{3}} \frac{\partial x_{2}'}{\partial x_{1}} A^{31} + \frac{\partial x_{1}'}{\partial x_{3}} \frac{\partial x_{2}'}{\partial x_{2}} A^{32}  + \frac{\partial x_{1}'}{\partial x_{3}} \frac{\partial x_{2}'}{\partial x_{3}} A^{33} }
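
Rather than writing out all nine equations of this kind by hand, one can let einsum perform the double sum. Here is a hedged sketch (the Jacobian and the components of A are arbitrary random choices of mine) verifying that the expanded expression for A'^{12} agrees with the condensed transformation law:

```python
import numpy as np

rng = np.random.default_rng(0)
J = rng.normal(size=(3, 3))     # J[a, g] = d x'_a / d x_g, an arbitrary Jacobian
A = rng.normal(size=(3, 3))     # components A^{gamma delta} in the old coordinates

# Condensed law: A'^{ab} = (d x'_a / d x_g)(d x'_b / d x_d) A^{gd}, summed over g, d
A_new = np.einsum('ag,bd,gd->ab', J, J, A)

# The nine-term expansion of A'^{12} written out explicitly (indices 1,2 -> 0,1)
expanded = sum(J[0, g] * J[1, d] * A[g, d] for g in range(3) for d in range(3))

print(np.isclose(A_new[0, 1], expanded))   # True
```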


So, we have just analysed the invariance of one of the flavours of tensors. Thinking mathematically, one should expect the existence of something like an “algebraic inverse” of a contravariant tensor, because a tensor is a generalization of a vector and in linear algebra we study inverse operations. Let’s consider a situation where we want to analyse the density of an object at different points. For simplicity, let’s consider a point A(x_1,x_2) on a plane surface with variable density.

[Figure: a surface whose density is different in different parts]

If we designate by \psi the density at A, then \frac{\partial \psi}{\partial x_1} and \frac{\partial \psi}{\partial x_2} represent, respectively, the partial variations of \psi in the x_1 and x_2 directions. Although \psi is a scalar quantity, the “change in \psi” is a directed quantity with components \frac{\partial \psi}{\partial x_1} and \frac{\partial \psi}{\partial x_2}. Note that the “change in \psi” is a tensor of rank one, because it depends upon the various directions. But it is a tensor in a sense different from what we saw in the case of “stress”. This “difference” will become clear once we analyse what happens to this quantity when the coordinate system is changed.

Now our aim is to express \frac{\partial \psi}{\partial x_1'} , \frac{\partial \psi}{\partial x_2'} in terms of \frac{\partial \psi}{\partial x_1} , \frac{\partial \psi}{\partial x_2}. Note that a change in x_1' will affect both x_1 and x_2 (as seen in the rotation of the 2D axes in the case of a vector). Hence, the resulting changes in x_1 and x_2 will affect \psi:

\displaystyle{\frac{\partial \psi}{\partial x_1'} = \frac{\partial \psi}{\partial x_1} \frac{\partial x_1}{\partial x_1'} + \frac{\partial \psi}{\partial x_2} \frac{\partial x_2}{\partial x_1'}; \quad  \frac{\partial \psi}{\partial x_2'} = \frac{\partial \psi}{\partial x_1} \frac{\partial x_1}{\partial x_2'} + \frac{\partial \psi}{\partial x_2} \frac{\partial x_2}{\partial x_2'}}

Here we have used the idea that if x, y, z are three variables such that z depends on y and y depends on x, and the direct calculation of the change in z per unit change in x is not easy, then we can calculate it using the chain rule: \frac{dz}{dx}  = \frac{dz}{dy} \frac{dy}{dx}. We can rewrite this system in condensed form as:

\displaystyle{\frac{\partial \psi}{\partial x_{\mu}'} = \sum_{\sigma} \frac{\partial \psi}{\partial x_{\sigma}} \frac{\partial x_{\sigma}}{\partial x_{\mu}'} }

for \mu =1,2 and \sigma =1,2. As before, we abbreviate it by omitting the summation symbol \sum_{\sigma}, summing over any index that occurs twice in a single term.

\displaystyle{\frac{\partial \psi}{\partial x_{\mu}'} =  \frac{\partial \psi}{\partial x_{\sigma}} \frac{\partial x_{\sigma}}{\partial x_{\mu}'} }

for \mu =1,2 and \sigma =1,2. Finally, replacing \frac{\partial \psi}{\partial x_{\mu}'} by A_{\mu}' and \frac{\partial \psi}{\partial x_{\sigma}} by A_{\sigma} (to make it similar to the notation introduced earlier for the contravariant vector), we get:

\displaystyle{\boxed{A_{ \mu}' = \frac{\partial x_{\sigma}}{\partial x_{\mu}'} A_{\sigma}}}

where A_1, A_2, \ldots are components of a vector in a certain coordinate system. Any set of quantities which transforms according to this equation is defined to be a covariant vector. Moreover, we can generalize this equation to a tensor of any rank. For example, a covariant tensor of rank two is defined by:

\displaystyle{A_{ \alpha \beta}' = \frac{\partial x_{\gamma}}{\partial x_{\alpha}'} \frac{\partial x_{\delta}}{\partial x_{\beta}'} A_{\gamma \delta}}

where the sum is over the indices \gamma and \delta (since they occur twice in the term on the right).

Comparing the (boxed) equations describing contravariant and covariant vectors, we observe that the coefficients on the right are reciprocals of each other (as promised…). Moreover, all these boxed equations represent the law of transformation for tensors of rank one (a.k.a. vectors), which can be generalized to a tensor of any rank.
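Here is a small numerical check of both boxed laws under an arbitrary invertible linear change of coordinates of my own choosing: the gradient of a scalar field transforms with \partial x_{\sigma}/\partial x_{\mu}' (the inverse Jacobian), this agrees with differentiating the field directly in the new coordinates, and the coefficient matrices of the two boxed laws are indeed reciprocal (matrix inverses) of each other.

```python
import numpy as np

# New coordinates x' = M x, so J[m, s] = d x'_m / d x_s = M[m, s]
M = np.array([[2.0, 1.0],
              [1.0, 3.0]])             # an arbitrary invertible matrix
J = M
J_inv = np.linalg.inv(M)               # J_inv[s, m] = d x_s / d x'_m

# A scalar field psi(x1, x2) = x1**2 * x2 and its gradient at a point
x = np.array([1.0, 2.0])
grad_old = np.array([2 * x[0] * x[1], x[0] ** 2])   # (d psi/d x1, d psi/d x2)

# Covariant law: A'_mu = (d x_sigma / d x'_mu) A_sigma
grad_new = np.einsum('sm,s->m', J_inv, grad_old)

# Cross-check: differentiate psi numerically in the primed coordinates
def psi_of_xprime(xp):
    x1, x2 = J_inv @ xp                # x = M^{-1} x'
    return x1 ** 2 * x2

xp = J @ x
eps = 1e-6
numeric = np.array([(psi_of_xprime(xp + eps * np.eye(2)[k]) -
                     psi_of_xprime(xp - eps * np.eye(2)[k])) / (2 * eps)
                    for k in range(2)])

print(np.allclose(grad_new, numeric))        # True: the gradient is covariant
print(np.allclose(J @ J_inv, np.eye(2)))     # True: the coefficients are reciprocal
```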


Our final task is to see how these two flavours of tensors interact with each other. Let’s study the algebraic operations of addition and multiplication for both flavours of tensors, just as we did for vectors (note that here “vector product” means the dot product, because the cross product can’t be generalized to n-dimensional vectors).

First consider the case of contravariant tensors. Let A^{\alpha} be a vector having two components A^{1} and A^{2} in a plane and B^{\alpha} be another such vector. If we define  A^{\alpha}+B^\alpha = C^\alpha and A^{\alpha} B^\beta = C^{\alpha \beta} (this allows 4 components, namely C^{11}, C^{12}, C^{21}, C^{22})  with

\displaystyle{A^{' \lambda} = \frac{\partial x_{\lambda}'}{\partial x_{\alpha}} A^{\alpha}; \quad B^{' \mu} = \frac{\partial x_{\mu}'}{\partial x_{\beta}} B^{\beta}}

for \lambda, \mu, \alpha, \beta =1,2, then on their addition and multiplication (called outer multiplication) we get:

\displaystyle{C^{' \lambda} = \frac{\partial x_{\lambda}'}{\partial x_{\alpha}} C^{\alpha}; \quad C^{' \lambda \mu} = \frac{\partial x_{\lambda}'}{\partial x_{\alpha}} \frac{\partial x_{\mu}'}{\partial x_{\beta}} C^{\alpha \beta}}

for \lambda, \mu, \alpha, \beta =1,2. One can prove this by patiently multiplying each term and then rearranging them. In general, if two contravariant tensors of ranks m and n, respectively, are multiplied together, the result is a contravariant tensor of rank m+n.
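Instead of the patient term-by-term multiplication, one can also verify the claim numerically. A minimal sketch (the change of coordinates and the two vectors are arbitrary illustrative choices of mine):

```python
import numpy as np

rng = np.random.default_rng(1)
J = rng.normal(size=(2, 2))     # J[l, a] = d x'_l / d x_a, an arbitrary Jacobian

A = np.array([1.0, 2.0])        # A^alpha in the old coordinates
B = np.array([-3.0, 0.5])       # B^beta  in the old coordinates
C = np.outer(A, B)              # outer product C^{alpha beta} = A^alpha B^beta

# Transform the vectors first, then take the outer product ...
lhs = np.outer(J @ A, J @ B)

# ... or transform C directly with the rank-2 contravariant law
rhs = np.einsum('la,mb,ab->lm', J, J, C)

print(np.allclose(lhs, rhs))    # True: the outer product is a rank-2 contravariant tensor
```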

For the case of covariant tensors, the addition and (outer) multiplication is done in same manner as above. Let A_{\alpha} be a vector having two components A_{1} and A_{2} in a plane and B_{\alpha} be another such vector. If we define  A_{\alpha}+B_\alpha = C_\alpha and A_{\alpha} B_\beta = C_{\alpha \beta} (this allows 4 components, namely C_{11}, C_{12}, C_{21}, C_{22})  with

\displaystyle{A_{ \lambda}' = \frac{\partial x_{\alpha}}{\partial x_{\lambda}'} A_{\alpha}; \quad B_{ \mu}' = \frac{\partial x_{\beta}}{\partial x_{\mu}'} B_{\beta}}

for \lambda, \mu, \alpha, \beta =1,2, then on their addition and multiplication (called outer multiplication) we get:

\displaystyle{C_{ \lambda} '= \frac{\partial x_{\alpha}}{\partial x_{\lambda}'} C_{\alpha}; \quad C_{ \lambda \mu}'= \frac{\partial x_{\alpha}} {\partial x_{\lambda}'}\frac{\partial x_{\beta}}{\partial x_{\mu}'} C_{\alpha \beta}}

for \lambda, \mu, \alpha, \beta =1,2. In general, if two covariant tensors of ranks m and n, respectively, are multiplied together, the result is a covariant tensor of rank m+n.

Now, as promised, it’s time to see how both of these flavours of tensors interact with each other. Let’s extend the notion of outer multiplication, defined above for each flavour of tensor, to the outer product of a contravariant tensor with a covariant tensor. For example, consider vectors (a.k.a. tensors of rank 1) of each type:

\displaystyle{A^{' \lambda} = \frac{\partial x_{\lambda}'}{\partial x_{\alpha}} A^{\alpha}; \quad B_{ \mu}' = \frac{\partial x_{\beta}}{\partial x_{\mu}'} B_{\beta}}

then their outer product leads to

\displaystyle{C^{' \lambda}_{ \mu}= \frac{\partial x_{\lambda}'}{\partial x_{\alpha}} \frac{\partial x_{\beta}}{\partial x_{\mu}'} C^{\alpha}_{\beta}}

where A^\alpha B_\beta = C^\alpha _\beta. This is neither a contravariant nor a covariant tensor, and hence is called a mixed tensor of rank 2. More generally, if a contravariant tensor of rank m and a covariant tensor of rank n are multiplied together so as to form their outer product, the result is a mixed tensor of rank m+n.
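The same kind of numerical check works for the mixed case: the contravariant index picks up a factor of the forward Jacobian, and the covariant index a factor of the inverse Jacobian. Again, the Jacobian and the two vectors below are arbitrary choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
J = rng.normal(size=(2, 2))             # J[l, a]     = d x'_l / d x_a
J_inv = np.linalg.inv(J)                # J_inv[b, m] = d x_b / d x'_m

A = np.array([1.0, -2.0])               # contravariant components A^alpha
B = np.array([0.5, 4.0])                # covariant components     B_beta
C = np.outer(A, B)                      # mixed tensor C^alpha_beta = A^alpha B_beta

A_new = J @ A                           # A'^lambda = (d x'_lambda / d x_alpha) A^alpha
B_new = J_inv.T @ B                     # B'_mu     = (d x_beta / d x'_mu) B_beta
lhs = np.outer(A_new, B_new)

# Transforming C directly with the mixed rank-2 law gives the same result
rhs = np.einsum('la,bm,ab->lm', J, J_inv, C)

print(np.allclose(lhs, rhs))            # True
```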

In general, if two mixed tensors of rank m (having m_1 contravariant indices/superscripts and m_2 covariant indices/subscripts, with m_1+m_2=m) and rank n (having n_1 contravariant indices/superscripts and n_2 covariant indices/subscripts, with n_1+n_2=n) are multiplied together, the result is a mixed tensor of rank m+n (having m_1+n_1 contravariant indices and m_2+n_2 covariant indices, with m_1+n_1+m_2+n_2=m+n).

Unlike the previous two types of tensors, we can’t illustrate this using a simple physical example. To convince yourself of how involved it gets, consider the following two mixed tensors of rank 3 and rank 2, respectively:

\displaystyle{A^{'\alpha \beta}_{\gamma} = \frac{\partial x_{\nu} }{\partial x_{\gamma}'}\frac{\partial x_{\alpha}'}{\partial x_{\lambda}}\frac{\partial x_{\beta}'}{\partial x_{\mu}} A^{\lambda \mu}_{\nu}; \quad B^{'\kappa}_{\delta} = \frac{\partial x_{\sigma}}{\partial x_{\delta}'}\frac{\partial x_{\kappa}'}{\partial x_{\rho}}B^{\rho}_{\sigma}}

then following the notations introduced, their outer product is of rank 5 and is given by

\displaystyle{C^{'\alpha\beta\kappa}_{\gamma\delta} = \frac{\partial x_{\nu} }{\partial x_{\gamma}'} \frac{\partial x_{\sigma}}{\partial x_{\delta}'}\frac{\partial x_{\alpha}'}{\partial x_{\lambda}}\frac{\partial x_{\beta}'}{\partial x_{\mu}}\frac{\partial x_{\kappa}'}{\partial x_{\rho}} C^{\lambda\mu\rho}_{\nu\sigma}}

Behind this compact notation, the computations are really involved. Suppose that we are working in a 3-dimensional vector space. Then the transformation law for the tensor \mathbf{A} represents a set of 27 (=3^3) equations, with each equation having 27 terms on the right. And the transformation law for the tensor \mathbf{B} represents a set of 9 (=3^2) equations, with each equation having 9 terms on the right. Therefore, the transformation law of their outer product tensor \mathbf{C} represents a set of 243 (=3^5) equations, with each equation having 243 terms on the right.
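For the curious, all of that bookkeeping collapses into a single einsum call. The sketch below (randomly chosen components and an arbitrary invertible Jacobian, purely for illustration) is a numerical spot-check rather than a proof: transforming A and B separately and then taking their outer product agrees with transforming the rank-5 tensor C directly.

```python
import numpy as np

rng = np.random.default_rng(3)
J = rng.normal(size=(3, 3))            # J[a, l]     = d x'_a / d x_l
J_inv = np.linalg.inv(J)               # J_inv[n, g] = d x_n / d x'_g

A = rng.normal(size=(3, 3, 3))         # components A^{lambda mu}_nu
B = rng.normal(size=(3, 3))            # components B^rho_sigma

# Transform each factor, then take the outer product
A_new = np.einsum('ng,al,bm,lmn->abg', J_inv, J, J, A)   # A'^{alpha beta}_gamma
B_new = np.einsum('sd,kr,rs->kd', J_inv, J, B)           # B'^kappa_delta
lhs = np.einsum('abg,kd->abkgd', A_new, B_new)           # C'^{alpha beta kappa}_{gamma delta}

# Transform the rank-5 outer product directly: 3**5 = 243 components,
# each a sum of 3**5 = 243 terms -- one einsum call instead of 243 long equations
C = np.einsum('lmn,rs->lmrns', A, B)                     # C^{lambda mu rho}_{nu sigma}
rhs = np.einsum('ng,sd,al,bm,kr,lmrns->abkgd', J_inv, J_inv, J, J, J, C)

print(np.allclose(lhs, rhs))           # True
```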

So, unlike the previous two cases of contravariant and covariant tensors, the proof of the transformation law for the outer product of mixed tensors is rather complicated and out of scope for this introductory post.


Reference:

[L] Lillian R. Lieber, The Einstein Theory of Relativity. Internet Archive: https://archive.org/details/einsteintheoryof032414mbp
