When I was in high school, two cool things that I learned from physics were “vectors” and “calculus”. I was (and still am) awestruck by the following statement:

Uniform circular motion is an accelerated motion.

At college, during my first (and second last) physics course, I was taught “vector calculus” and I didn’t enjoy it. Last year I learned “linear algebra”, that is, the study of vector spaces, matrices, linear transformations… Also, a few months ago I wrote about my understanding of Algebra. In it I briefly mentioned that “…study of symmetry of equations, geometric objects, etc. became one of the central topics of interest…” and this led to what we call “Abstract Algebra”, of which linear algebra is a part. The following video by 3blue1brown explains how our understanding of vectors from physics can be used to develop the subject of linear algebra.

But, one should ask: “Why do we care to classify physical quantities as scalars and vectors?” The answer to this question lies in the quest of physics to find the *invariants*, in terms of which we can state the laws of nature. In general, the idea of finding invariants is a useful problem-solving strategy in mathematics (the language of physics). For example, consider the following problem from the book “Problem-Solving Strategies” by Arthur Engel:

Suppose the positive integer n is odd. The numbers 1,2,…, 2n are written on the blackboard. Then one can pick any two numbers a and b, erase them and write instead |a-b|. Prove that an odd number will remain at the end.
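Before proving it, one can play the game on a computer. The following is only a minimal sketch: the function name `play`, the fixed random seed and the tested values of n are my own choices, and random play is of course an experiment, not a proof. It erases two random numbers at a time and checks that the parity of the sum on the board never changes.

```python
import random

def play(n, rng):
    """Play the blackboard game on 1..2n and return the last number left."""
    board = list(range(1, 2 * n + 1))
    parity = sum(board) % 2  # candidate invariant: parity of the sum
    while len(board) > 1:
        # pick two distinct positions, erase both numbers, write |a - b|
        i, j = rng.sample(range(len(board)), 2)
        a, b = board[i], board[j]
        board = [v for k, v in enumerate(board) if k not in (i, j)]
        board.append(abs(a - b))
        # the sum dropped by a + b - |a - b| = 2 * min(a, b), an even number
        assert sum(board) % 2 == parity
    return board[0]

rng = random.Random(0)  # fixed seed, an arbitrary choice
results = [play(n, rng) for n in (1, 3, 5, 7) for _ in range(50)]
print(all(r % 2 == 1 for r in results))
```

Every run with odd n ends on an odd number, whichever pairs are picked.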

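To prove this statement, one will have to use the concept of “parity” of the sum 1+2+3+…+2n as the invariant: replacing a and b by |a-b| changes the sum by the even number 2·min(a,b), and the initial sum n(2n+1) is odd when n is odd. And as stated in the video above, vectors are invariant under transformations of coordinate systems (*the components change, but length and direction of arrow remain unchanged*). For example, consider the rotation of 2D axis by angle θ, keeping the origin fixed. A point with coordinates $(x, y)$ in the old axes gets coordinates $(x', y')$ in the rotated axes, where

$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta.$$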
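To see the invariance numerically, here is a small sketch. It assumes the standard formulas for rotating the axes by θ, namely x' = x cos θ + y sin θ and y' = -x sin θ + y cos θ; the function name `rotate_axes` and the sample vector (3, 4) are mine.

```python
import math

def rotate_axes(x, y, theta):
    """Components of the same vector in axes rotated by theta (origin fixed)."""
    xp = x * math.cos(theta) + y * math.sin(theta)
    yp = -x * math.sin(theta) + y * math.cos(theta)
    return xp, yp

x, y = 3.0, 4.0
xp, yp = rotate_axes(x, y, math.radians(30))

print((x, y), (xp, yp))                      # the components change
print(math.hypot(x, y), math.hypot(xp, yp))  # the length does not
```

The two pairs of components differ, while both lengths agree up to rounding.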

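Now, we can rewrite this by using $x_1$ and $x_2$ instead of $x$ and $y$; and putting different subscripts on the single letter $a$ instead of functions of $\theta$:

$$x'_1 = a_{11}x_1 + a_{12}x_2, \qquad x'_2 = a_{21}x_1 + a_{22}x_2.$$

Now we differentiate this system of equations to get:

$$dx'_1 = \frac{\partial x'_1}{\partial x_1}dx_1 + \frac{\partial x'_1}{\partial x_2}dx_2, \qquad dx'_2 = \frac{\partial x'_2}{\partial x_1}dx_1 + \frac{\partial x'_2}{\partial x_2}dx_2$$

where $a_{\mu\sigma} = \dfrac{\partial x'_\mu}{\partial x_\sigma}$. We can rewrite this system in condensed form as:

$$dx'_\mu = \sum_{\sigma} \frac{\partial x'_\mu}{\partial x_\sigma}\,dx_\sigma$$

for $\mu = 1, 2$ and $\sigma = 1, 2$. We can further abbreviate it by omitting the summation symbol with the understanding that whenever a subscript occurs twice in a single term, we do summation on that subscript.

$$\boxed{dx'_\mu = \frac{\partial x'_\mu}{\partial x_\sigma}\,dx_\sigma}$$

for $\mu = 1, 2$ and $\sigma = 1, 2$. *This equation represents ANY transformation of coordinates whenever the values of $(x_\sigma)$ and $(x'_\mu)$ are in one-to-one correspondence.* Moreover, it can be extended to represent transformation of coordinates of any n-dimensional vector. For example, if $\mu = 1, 2, 3$ and $\sigma = 1, 2, 3$ then it represents coordinate transformations of a 3-dimensional vector.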
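The claim that the same formula covers any one-to-one change of coordinates can be tried on a transformation that is not a rotation. The sketch below uses a change to polar coordinates, which is my own choice of example (it is one-to-one away from the origin); the function names and the sample point are also mine. It compares the actual small change in the new coordinates with the sum over the repeated index σ of (∂x'_μ/∂x_σ) dx_σ.

```python
import math

def to_polar(x1, x2):
    """A nonlinear, one-to-one change of coordinates (away from the origin)."""
    return math.hypot(x1, x2), math.atan2(x2, x1)

def jacobian(x1, x2):
    """Partial derivatives d x'_mu / d x_sigma for the polar map above."""
    r2 = x1 * x1 + x2 * x2
    r = math.sqrt(r2)
    return [[x1 / r, x2 / r],
            [-x2 / r2, x1 / r2]]

x = (1.0, 2.0)
dx = (1e-6, -2e-6)  # a small displacement in the old coordinates
J = jacobian(*x)

# right-hand side: sum over the repeated index sigma
predicted = [sum(J[mu][s] * dx[s] for s in range(2)) for mu in range(2)]

# left-hand side: the actual change of the new coordinates
p0 = to_polar(*x)
p1 = to_polar(x[0] + dx[0], x[1] + dx[1])
actual = [p1[mu] - p0[mu] for mu in range(2)]

print(predicted)
print(actual)
```

The two lists agree up to terms of second order in the displacement, which is what a statement about differentials promises.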

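But, there are physical quantities which can’t be classified as scalar or vector. For example, “stress”, the internal force experienced by a material due to the “strain” caused by an external force, is described as a “tensor of rank 2”. This is so because the stress at any point on the surface depends upon the external force vector and the area vector, i.e. it describes things happening due to interaction between two vectors. The Cauchy stress tensor $\boldsymbol{\sigma}$ consists of nine components $\sigma_{ij}$ that completely define the state of stress at a point inside a material in the deformed state (where $i$ corresponds to the force component direction and $j$ corresponds to the area component direction). The tensor relates a unit-length direction vector **n** to the stress vector $\mathbf{T}^{(\mathbf{n})}$ across an imaginary surface perpendicular to **n**:

$$\mathbf{T}^{(\mathbf{n})} = \boldsymbol{\sigma}\,\mathbf{n} \qquad\text{or}\qquad T^{(\mathbf{n})}_i = \sigma_{ij}\,n_j$$

where,

$$\boldsymbol{\sigma} = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix}$$

where $\sigma_{11}$, $\sigma_{22}$ and $\sigma_{33}$ are normal stresses, and $\sigma_{12}$, $\sigma_{13}$, $\sigma_{21}$, $\sigma_{23}$, $\sigma_{31}$ and $\sigma_{32}$ are shear stresses. We can represent the stress vector acting on a plane with normal unit vector **n**, as:

[Figure: stress vector acting on a plane with normal unit vector **n**.]

A note on the sign convention: the tetrahedron is formed by slicing a parallelepiped along an arbitrary plane **n**. So, the force acting on the plane **n** is the reaction exerted by the other half of the parallelepiped and has an opposite sign.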
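To make the “interaction between two vectors” concrete, here is a small sketch that computes the stress vector from the nine components and a unit normal, assuming the index convention stated above (first index for the force direction, second for the area direction), i.e. T_i = σ_ij n_j. The numerical values of the stress components are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical stress components sigma_ij (arbitrary units)
sigma = [[50.0, 10.0, 0.0],
         [10.0, 20.0, 5.0],
         [0.0, 5.0, 30.0]]

def stress_vector(sigma, n):
    """T_i = sigma_ij n_j on the plane with unit normal n (sum over j)."""
    return [sum(sigma[i][j] * n[j] for j in range(3)) for i in range(3)]

# Plane perpendicular to the first axis: T is the first column of sigma
print(stress_vector(sigma, [1.0, 0.0, 0.0]))

# A tilted plane mixes normal and shear components
n = [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0]
T = stress_vector(sigma, n)
print(T)
```

The same nine numbers give a different force vector for every orientation of the plane, which is why neither a scalar nor a single vector is enough to describe stress.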

In this terminology, a scalar is a tensor of rank zero and a vector is a tensor of rank one. Moreover, in an n-dimensional space:

- a vector has n components
- a tensor of rank two has n^2 components
- a tensor of rank three has n^3 components
- and so on …

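Just like vectors, tensors in general are invariant under transformations of coordinate systems. We wish to exploit this further. Let’s reconsider the boxed equation stated earlier. Since we are working with the Euclidean metric, i.e. the length $s$ of a vector is given by $s^2 = x_1^2 + x_2^2$, we have $ds^2 = dx_1^2 + dx_2^2$, i.e. $dx_\sigma$ and $dx'_\mu$ are the components of $ds$ in the two coordinate systems. So, replacing $dx_\sigma$ and $dx'_\mu$ by $A^\sigma$ and $A'^\mu$ we get (motivation is to capture the idea of area vector)

$$\boxed{A'^\mu = \frac{\partial x'_\mu}{\partial x_\sigma}\,A^\sigma}$$

where $A^\sigma$ are components of a vector in a certain coordinate system (note that superscripts are just for indexing purposes and do NOT represent exponents). Any set of quantities which transforms according to this equation is defined to be a contravariant vector. Moreover, we can generalize this equation to a tensor of any rank. For example, a contravariant tensor of rank two is defined by:

$$A'^{\alpha\beta} = \frac{\partial x'_\alpha}{\partial x_\gamma}\frac{\partial x'_\beta}{\partial x_\delta}\,A^{\gamma\delta}$$

where the sum is over the indices $\gamma$ and $\delta$ (since they occur twice in the term on right). We can illustrate this for 3-dimensional space, i.e. $\alpha, \beta, \gamma, \delta = 1, 2, 3$ but summation performed only on $\gamma$ and $\delta$; for instance, if $\alpha = 1$ and $\beta = 2$ then we have a sum of nine terms:

$$A'^{12} = \frac{\partial x'_1}{\partial x_1}\frac{\partial x'_2}{\partial x_1}A^{11} + \frac{\partial x'_1}{\partial x_1}\frac{\partial x'_2}{\partial x_2}A^{12} + \frac{\partial x'_1}{\partial x_1}\frac{\partial x'_2}{\partial x_3}A^{13} + \cdots + \frac{\partial x'_1}{\partial x_3}\frac{\partial x'_2}{\partial x_3}A^{33}$$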
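The double summation hidden in the rank-two law is easy to spell out in code. This is a sketch under assumptions of my own: the change of coordinates is taken to be linear, so that the partial derivatives ∂x'_α/∂x_γ are just the constant entries of a matrix `M`, and both `M` and the components `A` are made-up numbers.

```python
from itertools import product

# Hypothetical constant coefficients M[alpha][gamma] = d x'_alpha / d x_gamma
M = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0],
     [1.0, 0.0, 1.0]]

# Hypothetical components A^{gamma delta} in the old coordinates
A = [[1.0, 0.0, 2.0],
     [0.0, 3.0, 0.0],
     [4.0, 0.0, 5.0]]

def transform_component(alpha, beta):
    """A'^{alpha beta}: sum over the repeated indices gamma and delta only."""
    terms = [M[alpha][g] * M[beta][d] * A[g][d]
             for g, d in product(range(3), repeat=2)]
    return sum(terms), len(terms)

# alpha = 1, beta = 2 in the text's numbering (indices are 0-based in code)
value, n_terms = transform_component(0, 1)
print(value, n_terms)
```

The indices α and β stay fixed while γ and δ each run over three values, so a single new component is a sum of nine terms.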

So, we just analysed the *invariance* of one of the flavours of tensors. Thinking mathematically, one should expect the existence of something “like an algebraic inverse” of a contravariant tensor, because a tensor is a generalization of a vector and in linear algebra we study inverse operations. Let’s consider a situation where we want to analyse the density of an object at different points. For simplicity, let’s consider a point on a plane surface with variable density.

If we designate by $\psi$ the density at A, then $\dfrac{\partial\psi}{\partial x_1}$ and $\dfrac{\partial\psi}{\partial x_2}$ represent, respectively, the partial variation of $\psi$ in the $x_1$ and $x_2$ directions. Although $\psi$ is a scalar quantity, the “**change in $\psi$**” is a directed quantity with components $\dfrac{\partial\psi}{\partial x_1}$ and $\dfrac{\partial\psi}{\partial x_2}$. Note that “change in $\psi$” is a tensor of rank one because it depends upon the various directions. But it’s a tensor in a sense different from what we saw in the case of “stress”. This “difference” will become clear once we analyse what happens to this quantity when the coordinate system is changed.

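Now our motive is to express $\dfrac{\partial\psi}{\partial x'_1}$, $\dfrac{\partial\psi}{\partial x'_2}$ in terms of $\dfrac{\partial\psi}{\partial x_1}$, $\dfrac{\partial\psi}{\partial x_2}$. Note that a change in $x'_1$ will affect “both” $x_1$ and $x_2$ (as seen in the rotation of 2D axis in the case of a vector). Hence, the resulting changes in $x_1$ and $x_2$ will affect $\psi$:

$$\frac{\partial\psi}{\partial x'_1} = \frac{\partial\psi}{\partial x_1}\frac{\partial x_1}{\partial x'_1} + \frac{\partial\psi}{\partial x_2}\frac{\partial x_2}{\partial x'_1}, \qquad \frac{\partial\psi}{\partial x'_2} = \frac{\partial\psi}{\partial x_1}\frac{\partial x_1}{\partial x'_2} + \frac{\partial\psi}{\partial x_2}\frac{\partial x_2}{\partial x'_2}$$

Here we have used the idea that if x, y, z are three variables such that z depends on y and y depends on x, and the change in z per unit change in x is NOT easy to calculate directly, then we can calculate it using: $\dfrac{dz}{dx} = \dfrac{dz}{dy}\cdot\dfrac{dy}{dx}$. We can rewrite this system in condensed form as:

$$\frac{\partial\psi}{\partial x'_\mu} = \sum_{\sigma} \frac{\partial\psi}{\partial x_\sigma}\frac{\partial x_\sigma}{\partial x'_\mu}$$

for $\mu = 1, 2$ and $\sigma = 1, 2$. We can further abbreviate it by omitting the summation symbol with the understanding that whenever a subscript occurs twice in a single term, we do summation on that subscript.

$$\frac{\partial\psi}{\partial x'_\mu} = \frac{\partial\psi}{\partial x_\sigma}\frac{\partial x_\sigma}{\partial x'_\mu}$$

for $\mu = 1, 2$ and $\sigma = 1, 2$. Finally, replacing $\dfrac{\partial\psi}{\partial x'_\mu}$ by $A'_\mu$ and $\dfrac{\partial\psi}{\partial x_\sigma}$ by $A_\sigma$ (to make it similar to the notation introduced in the case of the stress tensor)

$$\boxed{A'_\mu = \frac{\partial x_\sigma}{\partial x'_\mu}\,A_\sigma}$$

where $A_\sigma$ are components of a vector in a certain coordinate system. Any set of quantities which transforms according to this equation is defined to be a covariant vector. Moreover, we can generalize this equation to a tensor of any rank. For example, a covariant tensor of rank two is defined by:

$$A'_{\alpha\beta} = \frac{\partial x_\gamma}{\partial x'_\alpha}\frac{\partial x_\delta}{\partial x'_\beta}\,A_{\gamma\delta}$$

where the sum is over the indices $\gamma$ and $\delta$ (since they occur twice in the term on right).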
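The covariant law can be checked on a concrete density. Everything specific in this sketch is my own assumption: the density `psi`, the linear change of coordinates x' = Mx (deliberately not a rotation) and the sample point. The components ∂ψ/∂x'_μ are computed twice, once from the covariant law and once directly by finite differences along the new axes; the contravariant law is applied to the same numbers for contrast.

```python
def psi(x1, x2):
    """Hypothetical density field on the plane."""
    return x1 * x1 + 3.0 * x1 * x2

def grad_psi(x1, x2):
    """Components A_sigma = d psi / d x_sigma in the old coordinates."""
    return [2.0 * x1 + 3.0 * x2, 3.0 * x1]

# Hypothetical linear change of coordinates x' = M x, hence x = N x'
M = [[2.0, 1.0], [0.0, 1.0]]    # M[m][s] = d x'_mu / d x_sigma
N = [[0.5, -0.5], [0.0, 1.0]]   # N[s][m] = d x_sigma / d x'_mu (inverse of M)

def old_from_new(xp):
    return [sum(N[s][m] * xp[m] for m in range(2)) for s in range(2)]

xp = [1.0, 2.0]                 # a point, given in the new coordinates
A = grad_psi(*old_from_new(xp))

# covariant law: A'_mu = (d x_sigma / d x'_mu) A_sigma
A_new = [sum(N[s][m] * A[s] for s in range(2)) for m in range(2)]

# contravariant law applied to the same components, for contrast
wrong = [sum(M[m][s] * A[s] for s in range(2)) for m in range(2)]

# independent check: central differences of psi along the new axes
h = 1e-6
numeric = []
for m in range(2):
    up, down = list(xp), list(xp)
    up[m] += h
    down[m] -= h
    numeric.append((psi(*old_from_new(up)) - psi(*old_from_new(down))) / (2 * h))

print(A_new, numeric, wrong)
```

Only the covariant law reproduces the directly measured rates of change; for a rotation the two laws would give the same numbers, which is why the difference is easy to miss.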

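Comparing the (boxed) equations describing contravariant and covariant vectors, we observe that the coefficients on the right, $\dfrac{\partial x'_\mu}{\partial x_\sigma}$ and $\dfrac{\partial x_\sigma}{\partial x'_\mu}$, are reciprocal of each other in form (as promised…); more precisely, the two arrays of coefficients are inverse matrices of each other. Moreover, *all these boxed equations represent the law of transformation for tensors of rank one (a.k.a. vectors), which can be generalized to a tensor of any rank.*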

Our final task is to see how these two flavours of tensors interact with each other. Let’s study the algebraic operations of addition and multiplication for both flavours of tensors, just like the way we did for vectors (note that vector product = dot product, because cross product can’t be generalized to n-dimensional vectors).

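First consider the case of contravariant tensors. Let $A^\alpha$ be a vector having two components $A^1$ and $A^2$ in a plane and $B^\alpha$ be another such vector. If we define $A^\alpha + B^\alpha = C^\alpha$ and $A^\alpha B^\beta = C^{\alpha\beta}$ (this allows 4 components, namely $C^{11}, C^{12}, C^{21}, C^{22}$) with

$$A'^\lambda = \frac{\partial x'_\lambda}{\partial x_\alpha}\,A^\alpha, \qquad B'^\mu = \frac{\partial x'_\mu}{\partial x_\beta}\,B^\beta$$

for $\lambda, \mu, \alpha, \beta = 1, 2$, then on their addition and multiplication (called outer multiplication) we get:

$$C'^\lambda = \frac{\partial x'_\lambda}{\partial x_\alpha}\,C^\alpha, \qquad C'^{\lambda\mu} = \frac{\partial x'_\lambda}{\partial x_\alpha}\frac{\partial x'_\mu}{\partial x_\beta}\,C^{\alpha\beta}$$

for $\lambda, \mu, \alpha, \beta = 1, 2$. *One can prove this by patiently multiplying each term and then rearranging them.* In general, if two contravariant tensors of rank m and n respectively, are multiplied together, the result is a contravariant tensor of rank m+n.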
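Instead of multiplying out by hand, one can let a few lines of code do the patient part for one numerical example. This is only a sanity check under my own assumptions: the change of coordinates is linear with made-up constant coefficients `M` (standing for ∂x'_λ/∂x_α), and the vectors `A` and `B` are arbitrary.

```python
from itertools import product

# Hypothetical coefficients M[lam][a] = d x'_lam / d x_a
M = [[2.0, 1.0], [0.0, 1.0]]
A = [1.0, 2.0]    # components A^a
B = [3.0, -1.0]   # components B^b

def transform_vector(V):
    """Contravariant law: V'^lam = (d x'_lam / d x_a) V^a."""
    return [sum(M[lam][a] * V[a] for a in range(2)) for lam in range(2)]

A_new, B_new = transform_vector(A), transform_vector(B)

# Route 1: multiply the transformed vectors, C'^{lam mu} = A'^lam B'^mu
C_from_vectors = [[A_new[l] * B_new[m] for m in range(2)] for l in range(2)]

# Route 2: form C^{ab} = A^a B^b first, then apply the rank-two law
C = [[A[a] * B[b] for b in range(2)] for a in range(2)]
C_from_law = [[sum(M[l][a] * M[m][b] * C[a][b]
                   for a, b in product(range(2), repeat=2))
               for m in range(2)] for l in range(2)]

# Addition: C^a = A^a + B^a transforms like a vector
sum_new = transform_vector([A[a] + B[a] for a in range(2)])

print(C_from_vectors)
print(C_from_law)
print(sum_new, [A_new[l] + B_new[l] for l in range(2)])
```

Both routes give the same four components, which is exactly what it means for the outer product to be a contravariant tensor of rank two.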

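For the case of covariant tensors, the addition and (outer) multiplication is done in the same manner as above. Let $A_\alpha$ be a vector having two components $A_1$ and $A_2$ in a plane and $B_\alpha$ be another such vector. If we define $A_\alpha + B_\alpha = C_\alpha$ and $A_\alpha B_\beta = C_{\alpha\beta}$ (this allows 4 components, namely $C_{11}, C_{12}, C_{21}, C_{22}$) with

$$A'_\lambda = \frac{\partial x_\alpha}{\partial x'_\lambda}\,A_\alpha, \qquad B'_\mu = \frac{\partial x_\beta}{\partial x'_\mu}\,B_\beta$$

for $\lambda, \mu, \alpha, \beta = 1, 2$, then on their addition and multiplication (called outer multiplication) we get:

$$C'_\lambda = \frac{\partial x_\alpha}{\partial x'_\lambda}\,C_\alpha, \qquad C'_{\lambda\mu} = \frac{\partial x_\alpha}{\partial x'_\lambda}\frac{\partial x_\beta}{\partial x'_\mu}\,C_{\alpha\beta}$$

for $\lambda, \mu, \alpha, \beta = 1, 2$. In general, if two covariant tensors of rank m and n respectively, are multiplied together, the result is a covariant tensor of rank m+n.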

Now, as promised, it’s time to see how both of these flavours of tensors interact with each other. Let’s extend the notion of outer multiplication defined for each flavour of tensor, to the outer product of a contravariant tensor with a covariant tensor. For example, consider vectors (a.k.a. tensors of rank 1) of each type:

$$A'^\lambda = \frac{\partial x'_\lambda}{\partial x_\alpha}\,A^\alpha, \qquad B'_\mu = \frac{\partial x_\beta}{\partial x'_\mu}\,B_\beta$$

then their outer product leads to

$$C'^{\lambda}_{\mu} = \frac{\partial x'_\lambda}{\partial x_\alpha}\frac{\partial x_\beta}{\partial x'_\mu}\,C^{\alpha}_{\beta}$$

where $C^{\alpha}_{\beta} = A^\alpha B_\beta$. This is neither a contravariant nor a covariant tensor, hence is rather called a mixed tensor of rank 2. More generally, if a contravariant tensor of rank m and a covariant tensor of rank n are multiplied together so as to form their outer product, the result is a mixed tensor of rank m+n.
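The rank-two mixed case can be checked numerically in the same spirit. The sketch assumes a linear change of coordinates with made-up coefficients: `M` stands for ∂x'_λ/∂x_α and its inverse `N` for ∂x_β/∂x'_μ; the two vectors are arbitrary.

```python
from itertools import product

M = [[2.0, 1.0], [0.0, 1.0]]    # M[lam][a] = d x'_lam / d x_a  (hypothetical)
N = [[0.5, -0.5], [0.0, 1.0]]   # N[b][mu] = d x_b / d x'_mu    (inverse of M)

A = [1.0, 2.0]    # contravariant components A^a
B = [3.0, -1.0]   # covariant components B_b

# each vector transforms with its own law
A_new = [sum(M[l][a] * A[a] for a in range(2)) for l in range(2)]
B_new = [sum(N[b][m] * B[b] for b in range(2)) for m in range(2)]

# mixed tensor C^a_b = A^a B_b, stored as C[a][b]
C = [[A[a] * B[b] for b in range(2)] for a in range(2)]

# mixed law: one factor of M for the upper index, one factor of N for the lower
C_new = [[sum(M[l][a] * N[b][m] * C[a][b]
              for a, b in product(range(2), repeat=2))
          for m in range(2)] for l in range(2)]

print(C_new)
print([[A_new[l] * B_new[m] for m in range(2)] for l in range(2)])
```

The upper index picks up the contravariant coefficient and the lower index the covariant one, and the result matches the product of the separately transformed vectors.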

In general, if two mixed tensors of rank m (having $m_1$ indices/superscripts of contravariance and $m_2$ indices/subscripts of covariance, such that $m_1 + m_2 = m$) and n (having $n_1$ indices/superscripts of contravariance and $n_2$ indices/subscripts of covariance, such that $n_1 + n_2 = n$) respectively, are multiplied together, the result is a mixed tensor of rank m+n (having $m_1 + n_1$ indices/superscripts of contravariance and $m_2 + n_2$ indices/subscripts of covariance, such that $(m_1 + n_1) + (m_2 + n_2) = m + n$).

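Unlike the previous two types of tensors, we can’t illustrate it using a simple physical example. To convince yourself, consider the following two mixed tensors of rank 3 and rank 2, respectively:

$$A'^{\alpha\beta}_{\gamma} = \frac{\partial x_\nu}{\partial x'_\gamma}\frac{\partial x'_\alpha}{\partial x_\lambda}\frac{\partial x'_\beta}{\partial x_\mu}\,A^{\lambda\mu}_{\nu}, \qquad B'^{\kappa}_{\delta} = \frac{\partial x_\sigma}{\partial x'_\delta}\frac{\partial x'_\kappa}{\partial x_\rho}\,B^{\rho}_{\sigma}$$

then following the notations introduced, their outer product $C^{\lambda\mu\rho}_{\nu\sigma} = A^{\lambda\mu}_{\nu}B^{\rho}_{\sigma}$ is of rank 5 and is given by

$$C'^{\alpha\beta\kappa}_{\gamma\delta} = \frac{\partial x_\nu}{\partial x'_\gamma}\frac{\partial x_\sigma}{\partial x'_\delta}\frac{\partial x'_\alpha}{\partial x_\lambda}\frac{\partial x'_\beta}{\partial x_\mu}\frac{\partial x'_\kappa}{\partial x_\rho}\,C^{\lambda\mu\rho}_{\nu\sigma}$$

Behind this notation, the processes are really complicated. Now, suppose that we are working in 3D vector space. Then, the transformation law for the tensor $A^{\lambda\mu}_{\nu}$ represents a set of 27 (=3^3) equations with each equation having 27 terms on the right. And the transformation law for the tensor $B^{\rho}_{\sigma}$ represents a set of 9 (=3^2) equations with each equation having 9 terms on the right. Therefore, the transformation law of their outer product tensor $C^{\lambda\mu\rho}_{\nu\sigma}$ represents a set of 243 (=3^5) equations with each equation having 243 terms on the right.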
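These counts can be reproduced by enumerating index values. The sketch below is just bookkeeping (the function name is mine): for a tensor of a given rank, each equation corresponds to one choice of values for the free indices, and each term on the right corresponds to one choice of values for the summed indices.

```python
from itertools import product

def count_equations_and_terms(rank, dim=3):
    """Count the equations and the terms per equation in a transformation law."""
    # one equation per assignment of values to the `rank` free indices
    equations = sum(1 for _ in product(range(dim), repeat=rank))
    # one term per assignment of values to the `rank` summed indices
    terms_per_equation = sum(1 for _ in product(range(dim), repeat=rank))
    return equations, terms_per_equation

for rank in (3, 2, 5):
    print(rank, count_equations_and_terms(rank))
```

For ranks 3, 2 and 5 in three dimensions this gives 27, 9 and 243 equations, each with as many terms on the right.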

*So, unlike the previous two cases of contravariant and covariant tensors, the proof for the outer product of mixed tensors is rather complicated and out of scope for discussion in this introductory post.*

**Reference:**

[L] Lillian R. Lieber, *The Einstein Theory of Relativity*. Internet Archive: https://archive.org/details/einsteintheoryof032414mbp