Tag Archives: complex numbers

Polynomials and Commutativity


In high school, I came to know about the statement of the fundamental theorem of algebra:

Every polynomial of degree n with integer coefficients has exactly n complex roots (with appropriate multiplicity).

In high school, "polynomial" meant a polynomial in one variable. Then last year I learned 3 different proofs (involving topology, complex analysis and Galois theory) of the following statement of the fundamental theorem of algebra:

Every non-zero, single-variable, degree n polynomial with complex coefficients has, counted with multiplicity, exactly n complex roots.

A more general statement about the number of roots of a polynomial in one variable is the Factor Theorem:

Let R be a commutative ring with identity and let p(x)\in R[x] be a polynomial with coefficients in R. The element a\in R is a root of p(x) if and only if (x-a) divides p(x).

A corollary of the above theorem is:

A polynomial f of degree n over a field F has at most n roots in F.

(In case you know undergraduate level algebra, recall that R[x] is a Principal Ideal Domain if and only if R is a field.)
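The field hypothesis in this corollary cannot simply be dropped, even for commutative rings. As a small sketch (the helper `roots_mod` and the choice of moduli are mine, for illustration only), the following brute-force search lists the roots of x^2 - 1 in \mathbb{Z}/7\mathbb{Z}, which is a field, and in \mathbb{Z}/8\mathbb{Z}, which is commutative but not a field:

```python
def roots_mod(coeffs, m):
    """Return all residues r in Z/mZ at which the polynomial vanishes.

    coeffs lists the coefficients from the constant term upwards,
    so [-1, 0, 1] stands for x^2 - 1.
    """
    def evaluate(r):
        return sum(c * pow(r, k, m) for k, c in enumerate(coeffs)) % m
    return [r for r in range(m) if evaluate(r) == 0]


print(roots_mod([-1, 0, 1], 7))  # Z/7Z is a field: [1, 6]
print(roots_mod([-1, 0, 1], 8))  # Z/8Z is not a field: [1, 3, 5, 7]
```

A degree 2 polynomial can have 4 roots in \mathbb{Z}/8\mathbb{Z} because that ring has zero divisors.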

A key fact that often goes unnoticed regarding the number of roots of a given polynomial (in one variable) is that the coefficients and solutions belong to a commutative ring (and \mathbb{C} is a field, hence a commutative ring). The key step in the proofs of all the above theorems is the division algorithm, which holds only in some special commutative rings (like fields). I would like to illustrate my point with the following fact:

The equation X^2 + X + 1 = 0 has only 2 complex roots, namely \omega = \frac{-1+i\sqrt{3}}{2} and \omega^2 = \frac{-1-i\sqrt{3}}{2}. But if we look for solutions among 2×2 matrices (a non-commutative ring), then we have at least 3 solutions (reading 1 as the 2×2 identity matrix and 0 as the 2×2 zero matrix):

\displaystyle{A=\begin{bmatrix} 0 & -1 \\1 & -1 \end{bmatrix}, B=\begin{bmatrix} \omega & 0 \\0 & \omega^2 \end{bmatrix}, C=\begin{bmatrix} \omega^2 & 0 \\0 & \omega \end{bmatrix}}

if we allow complex entries. This phenomenon can also be illustrated using a non-commutative number system, like the quaternions. For more details refer to this Math.SE discussion.
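As a quick sanity check, the three matrices can be substituted into X^2 + X + 1 numerically. This is only a sketch in plain Python; the helper names `matmul` and `residual` are my own and not taken from any library:

```python
import cmath


def matmul(p, q):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def residual(x):
    """Return the largest absolute entry of X^2 + X + I."""
    identity = [[1, 0], [0, 1]]
    x2 = matmul(x, x)
    return max(abs(x2[i][j] + x[i][j] + identity[i][j])
               for i in range(2) for j in range(2))


omega = cmath.exp(2j * cmath.pi / 3)  # equals (-1 + i*sqrt(3)) / 2

A = [[0, -1], [1, -1]]
B = [[omega, 0], [0, omega ** 2]]
C = [[omega ** 2, 0], [0, omega]]

for name, m in (("A", A), ("B", B), ("C", C)):
    # A has integer entries, so its residual is exactly 0;
    # B and C are only zero up to floating-point rounding.
    print(name, residual(m) < 1e-12)
```

Conjugating A by any invertible matrix P gives another solution P A P^{-1}, so over matrices there are in fact infinitely many solutions.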


Midpoint Polygon Conjecture is false


Contrary to my expectations, my previous post turned out to be like Popular-Lonely primes and Decimal Problem, i.e. I discovered nothing new.

My conjecture is false. The following counterexample is given on p. 234 of this paper:


Counterexample of the conjecture, taken from: Berlekamp, E. R., E. N. Gilbert, and F. W. Sinden. “A Polygon Problem.” The American Mathematical Monthly 72, no. 3 (1965): 233-41. doi:10.2307/2313689.

As pointed out by uncombed_coconut, the correct theorem is:

Theorem (Berlekamp-Gilbert-Sinden, 1965). For almost all simple polygons there exists a smallest natural number k such that after k iterations of the midpoint polygon procedure, we obtain a convex polygon.

The proof of this theorem is very interesting. Till now I thought that proving Euclidean geometry theorems using complex numbers was overkill. But using an N-tuple of complex numbers to represent the vertices of a closed polygon (in the given order), \mathbf{z} = (z_1,\ldots , z_N), we can restate the problem in terms of eigenvectors (referred to as eigenpolygons) and eigenvalues. The following are the crucial facts used in the proof:

  • An arbitrary N-gon (need not be simple) can be written as a sum of regular N-gons i.e. the eigenvalues are distinct.
  • The coefficient of k^{th} eigenvector (when N-gon is written as linear combination of eigenpolygons) is the centroid of the polygon obtained by “winding” \mathbf{z} k times.
  • All vertices of the midpoint polygons (obtained by repeating the midpoint polygon procedure infinitely many times) converge to the centroid.
  • The sum of two convex components of \mathbf{z} is a polygon. This polygon is the affine image of a regular convex N-gon, all of whose vertices lie on an ellipse. (as pointed out by Nikhil)
  • A necessary and sufficient condition for \mathbf{z} to have a convex midpoint polygon (after finitely many iterations of the midpoint polygon procedure) is that the ellipse circumscribing the sum of two convex components of \mathbf{z} is nondegenerate. (The degenerate form of an ellipse is a point.)
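To make the eigenpolygon language concrete, here is a minimal sketch in plain Python. It assumes the midpoint procedure sends each vertex z_j to (z_j + z_{j+1})/2 and that the k-th eigenpolygon is the regular N-gon with vertices \omega^{kj}, where \omega = e^{2\pi i/N}. The function names are mine, not the paper's:

```python
import cmath


def midpoint_polygon(z):
    """One step of the midpoint procedure on a closed polygon."""
    n = len(z)
    return [(z[j] + z[(j + 1) % n]) / 2 for j in range(n)]


def eigenpolygon(n, k):
    """Regular n-gon with vertices w^(k*j), where w = exp(2*pi*i/n)."""
    w = cmath.exp(2j * cmath.pi * k / n)
    return [w ** j for j in range(n)]


def eigenvalue(n, k):
    """Factor by which the midpoint step scales the k-th eigenpolygon."""
    return (1 + cmath.exp(2j * cmath.pi * k / n)) / 2


def centroid(z):
    return sum(z) / len(z)


# The midpoint step only rescales and rotates an eigenpolygon.
star = eigenpolygon(5, 2)
lam = eigenvalue(5, 2)
print(all(abs(m - lam * v) < 1e-12
          for m, v in zip(midpoint_polygon(star), star)))

# An arbitrary (non-convex) pentagon shrinks towards its centroid.
poly = [0, 4, 1 + 1j, 4 + 3j, 3j]
c = centroid(poly)
for _ in range(200):
    poly = midpoint_polygon(poly)
print(max(abs(v - c) for v in poly) < 1e-9)
```

Every eigenvalue with k ≠ 0 has modulus |\cos(\pi k/N)| < 1, which is why all components except the centroid die out under iteration.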

For a nice outline of the proof, please refer to the comment by uncombed_coconut on the previous post.

Since I didn’t know that this is a well-studied problem (and that too by a well-known mathematician!), I was trying to prove it on my own. Though I didn’t make much progress, I discovered some interesting theorems which I will share in future posts.

Real vs Complex Plane


The real plane is denoted by \mathbb{R}^2 and is commonly referred to as the Cartesian plane. When we talk about \mathbb{R}^2 we mean that \mathbb{R}^2 is a vector space over \mathbb{R}. But when you view \mathbb{R}^2 as the Cartesian plane, it’s technically not a vector space but rather an affine space, on which a vector space acts by translations, i.e. there is no canonical choice of where the origin should go, because it can be translated anywhere.


Cartesian Plane (345Kai at the English language Wikipedia [Public domain, GFDL or CC-BY-SA-3.0], via Wikimedia Commons)

On the other hand, the complex plane is denoted by \mathbb{C} and is commonly referred to as the Argand plane. When we talk about \mathbb{C}, we mean that \mathbb{R}^2 is a field (by exploiting the tuple structure of elements), since there is only one way to explicitly define a field structure on the set \mathbb{R}^2, and that’s how we view \mathbb{C} as a field (if you allow the axiom of choice, there are more possibilities; see this Math.SE discussion).


Argand Plane (Shiva Sitaraman at Quora)

So, when we care about the vector space structure of \mathbb{R}^2 we refer to the Cartesian plane, and when we care about the field structure of \mathbb{R}^2 we refer to the Argand plane. An immediate consequence of this difference between the real and complex planes is seen when we study multivariable analysis and complex analysis, where we consider the vector space structure and the field structure, respectively (see this Math.SE discussion for details). Hence the definition of differentiation of a function defined on \mathbb{C} is a special case of the definition of differentiation of a function defined on \mathbb{R}^2.
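One way to see the "special case" concretely: a function f = u + iv that is differentiable as a map on \mathbb{R}^2 has a real 2×2 Jacobian, and it is complex differentiable when that Jacobian acts like multiplication by a complex number, i.e. when the Cauchy-Riemann equations u_x = v_y and u_y = -v_x hold. The following rough numerical sketch (finite differences, with a step size and tolerance that I picked arbitrarily) compares z^2 with complex conjugation:

```python
def jacobian(f, z, h=1e-6):
    """Approximate the real Jacobian ((u_x, u_y), (v_x, v_y)) of f at z
    using central differences."""
    dx = (f(z + h) - f(z - h)) / (2 * h)            # u_x + i v_x
    dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # u_y + i v_y
    return ((dx.real, dy.real), (dx.imag, dy.imag))


def satisfies_cauchy_riemann(f, z, tol=1e-6):
    """Check u_x = v_y and u_y = -v_x at the point z, up to tol."""
    (ux, uy), (vx, vy) = jacobian(f, z)
    return abs(ux - vy) < tol and abs(uy + vx) < tol


point = 1 + 2j
print(satisfies_cauchy_riemann(lambda z: z * z, point))          # complex differentiable
print(satisfies_cauchy_riemann(lambda z: z.conjugate(), point))  # only real differentiable
```

Conjugation is a perfectly smooth linear map of \mathbb{R}^2, yet it fails the extra condition, so it is differentiable in the multivariable sense but not in the complex sense.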

Real vs Complex numbers


I want to talk about the algebraic and analytic differences between real and complex numbers. Firstly, let’s have a look at the following beautiful explanation by Richard Feynman (from his QED lectures) about the similarities between real and complex numbers:


From Chapter 2 of the book “QED – The Strange Theory of Light and Matter” © Richard P. Feynman, 1985.

Before reading this explanation, I used to believe that the need to establish the “Fundamental Theorem of Algebra” (read this beautiful paper by Daniel J. Velleman to learn about a proof of this theorem) was the only way to motivate the study of complex numbers.

The fundamental difference between real and complex numbers is:

Real numbers form an ordered field, but complex numbers can’t form an ordered field. [Proof]

Here we define an ordered field as follows:

Let \mathbf{F} be a field. Suppose that there is a set \mathcal{P} \subset \mathbf{F} which satisfies the following properties:

  • For each x \in \mathbf{F}, exactly one of the following statements holds: x \in \mathcal{P}, -x \in \mathcal{P}, x =0.
  • For x,y \in \mathcal{P}, xy \in \mathcal{P} and x+y \in \mathcal{P}.

If such a \mathcal{P} exists, then \mathbf{F} is an ordered field. Moreover, we define x \le y \Leftrightarrow y -x \in \mathcal{P} \vee x = y.
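For the reader who does not want to follow the proof link, here is a short sketch of the standard argument that no such \mathcal{P} can exist in \mathbb{C}, using only the two properties above:

```latex
\begin{aligned}
& i \neq 0 \implies i \in \mathcal{P} \ \text{ or } \ -i \in \mathcal{P}, \\
& i \in \mathcal{P} \implies i \cdot i = -1 \in \mathcal{P},
  \qquad
  -i \in \mathcal{P} \implies (-i)(-i) = -1 \in \mathcal{P}, \\
& -1 \in \mathcal{P} \implies (-1)(-1) = 1 \in \mathcal{P}, \\
& \text{but } 1 \in \mathcal{P} \text{ and } -1 \in \mathcal{P}
  \text{ together contradict the first property (with } x = 1\text{).}
\end{aligned}
```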

Note that if we do not insist on retaining the vector space structure of the complex numbers, we CAN establish an order on them [Proof], but that is useless. I find this consequence pretty interesting, because \mathbb{R} and \mathbb{C} are isomorphic as additive groups (and as vector spaces over \mathbb{Q}) but not isomorphic as rings (and hence not isomorphic as fields).

Now let’s have a look at the consequence of the difference between the two number systems due to the order structure.

Though both the real and the complex numbers form a complete field (completeness being a property of metric spaces), only the real numbers have the least upper bound property.

Here we define the least upper bound property as follows:

Let \mathcal{S} be a non-empty set of real numbers.

  • A real number x is called an upper bound for \mathcal{S} if x \geq s for all s\in \mathcal{S}.
  • A real number x is the least upper bound (or supremum) for \mathcal{S} if x is an upper bound for \mathcal{S} and x \leq y for every upper bound y of \mathcal{S} .

The least-upper-bound property states that any non-empty set of real numbers that has an upper bound must have a least upper bound in real numbers.
This least upper bound property is referred to as Dedekind completeness. Therefore, though both \mathbb{R} and \mathbb{C} are complete as metric spaces [proof], only \mathbb{R} is Dedekind complete.

In an arbitrary ordered field one has the notion of Dedekind completeness — every nonempty bounded above subset has a least upper bound — and also the notion of sequential completeness — every Cauchy sequence converges. The main theorem relating these two notions of completeness is as follows [source]:

For an ordered field \mathbf{F}, the following are equivalent:
(i) \mathbf{F} is Dedekind complete.
(ii) \mathbf{F} is sequentially complete and Archimedean.

Here we define an Archimedean field as an ordered field such that for each element there exists a finite expression 1+1+\ldots+1 whose value is greater than that element, that is, there are no infinite elements.

As remarked earlier, \mathbb{C} is not an ordered field and hence can’t be Archimedean. Therefore, \mathbb{C} can’t have the least-upper-bound property, though it’s complete in the metric sense. So, the consequence of all this is:

We can’t use complex numbers for counting.

But still, complex numbers are a very important part of modern arithmetic (number theory), because they enable us to view properties of numbers from a geometric point of view [source].

Imaginary Angles


You will have heard about imaginary numbers, the most famous of them being i=\sqrt{-1}. I personally don’t like this name because all of mathematics is man/woman made, hence all mathematical objects are imaginary (there is no perfect circle in nature…) and lack physical meaning. Moreover, these numbers are very useful in physics (a.k.a. the study of nature using mathematics). For example, consider the time-dependent Schrödinger equation:

\displaystyle{i \hbar \frac{\partial}{\partial t}\Psi(\mathbf{r},t) = \hat H \Psi(\mathbf{r},t)}

But, as described here:

Complex numbers are a tool for describing a theory, not a property of the theory itself. Which is to say that they can not be the fundamental difference between classical and quantum mechanics (QM). The real origin of the difference is the non-commutative nature of measurement in QM. Now this is a property that can be captured by all kinds of beasts — even real-valued matrices. [Physics.SE]

For more such interpretations see: Volume 1, Chapter 22 of “The Feynman Lectures on Physics”. And also this discussion about Hawking’s wave function.

All these facts may not have fascinated you, but the following fact from Einstein’s Special Relativity should fascinate you:

In 1908 Hermann Minkowski explained how the Lorentz transformation could be seen as simply a hyperbolic rotation of the spacetime coordinates, i.e., a rotation through an imaginary angle. [Wiki: Rapidity]

Irrespective of whether you understand Einstein’s relativity, the concept of an imaginary angle appears bizarre. But mathematically it’s just another consequence of non-Euclidean geometry, which can be interpreted via the hyperbolic law of cosines etc. For example:

\displaystyle{\cos (\alpha+i\beta) = \cos (\alpha) \cosh (\beta) - i \sin (\alpha) \sinh (\beta)}

\displaystyle{\sin (\alpha+i\beta) = \sin (\alpha) \cosh (\beta) + i \cos (\alpha) \sinh (\beta)}
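These two expansions are easy to spot-check numerically. The sketch below uses Python's standard `cmath` and `math` modules; the helper names and the sample values of \alpha and \beta are arbitrary choices of mine:

```python
import cmath
import math


def cos_expansion(alpha, beta):
    """Right-hand side of the expansion of cos(alpha + i*beta)."""
    return complex(math.cos(alpha) * math.cosh(beta),
                   -math.sin(alpha) * math.sinh(beta))


def sin_expansion(alpha, beta):
    """Right-hand side of the expansion of sin(alpha + i*beta)."""
    return complex(math.sin(alpha) * math.cosh(beta),
                   math.cos(alpha) * math.sinh(beta))


alpha, beta = 0.7, 1.3
z = complex(alpha, beta)
# Both differences should be zero up to floating-point rounding.
print(abs(cmath.cos(z) - cos_expansion(alpha, beta)))
print(abs(cmath.sin(z) - sin_expansion(alpha, beta)))
```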

Let’s try to understand what is meant by “imaginary angle” by following the article “A geometric view of complex trigonometric functions” by Richard Hammack. Consider the complex unit circle U=\{(z,w)\in \mathbb{C}^2 \ : \ z^2+w^2=1\} of \mathbb{C}^2, defined in a manner exactly analogous to the standard unit circle in \mathbb{R}^2. Apparently U is some sort of surface in \mathbb{C}^2, but it can’t be drawn as simply as the usual unit circle, owing to the four-dimensional character of \mathbb{C}^2. But we can examine its lower dimensional cross sections. For example, if z=x+iy and w=u+iv, then by setting y = 0 we get the circle x^2+u^2=1 in the x-u plane for v=0 and the hyperbola x^2-v^2 = 1 in the x-vi plane for u=0.


The cross-section of complex unit circle (defined by z^2+w^2=1 for complex numbers z and w) with the x-u-vi coordinate space (where z=x+iy and w=u+iv) © 2007 Mathematical Association of America

These two curves (circle and hyperbola) touch at the points ±o, where o=(1,0) in \mathbb{C}^2, as illustrated above. The symbol o is used by Richard Hammack because this point will turn out to be the origin of complex radian measure.

Let’s define the complex distance between points \mathbf{a} =(z_1,w_1) and \mathbf{b}=(z_2,w_2) in \mathbb{C}^2 as

\displaystyle{d(\mathbf{a},\mathbf{b}) = \sqrt{(z_1-z_2)^2+(w_1-w_2)^2}}

where the square root takes values in the half-plane H of \mathbb{C} consisting of the non-negative imaginary axis and the numbers with a positive real part. Therefore, the complex distance between two points in \mathbb{C}^2 is a complex number (with non-negative real part).
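Here is a small sketch of such a distance in Python. It assumes that the complex distance is the direct analogue of the Euclidean formula, \sqrt{(z_1-z_2)^2+(w_1-w_2)^2}, with the square root forced into H. The names `sqrt_in_H` and `complex_distance` are mine. Note that it measures straight-line distance, not arclength:

```python
import cmath
import math


def sqrt_in_H(v):
    """Square root of v lying in H: positive real part, or on the
    non-negative imaginary axis."""
    r = cmath.sqrt(v)
    if r.real < 0 or (r.real == 0 and r.imag < 0):
        r = -r
    return r


def complex_distance(a, b):
    """Assumed analogue of Euclidean distance for points of C^2."""
    dz = a[0] - b[0]
    dw = a[1] - b[1]
    return sqrt_in_H(dz * dz + dw * dw)


o = (1, 0)
on_circle = (math.cos(1.0), math.sin(1.0))             # real cross-section
on_hyperbola = (math.cosh(1.0), 1j * math.sinh(1.0))   # imaginary cross-section

print(complex_distance(o, on_circle))     # a real number
print(complex_distance(o, on_hyperbola))  # a purely imaginary number
```

Moving away from o = (1,0) along the circle gives real distances, while moving along the hyperbola gives imaginary ones.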

Starting at the point o in the figure above, one can move either along the circle or along the right-hand branch of the hyperbola. On investigating these two choices, we conclude that they involve traversing either a real or an imaginary distance. Generalizing the idea of real radian measure, we define imaginary radian measure to be the oriented arclength from o to a point p on the hyperbola, as shown in part (b) of the figure below:


(a) Real radian measure (b) Imaginary radian measure. © 2007 Mathematical Association of America

If p is above the x axis, its radian measure is \beta i with \beta >0, while if it is below the x axis, its radian measure is \beta i with \beta <0. As in the real case, we define \cos (\beta i) and \sin (\beta i) to be the z and w coordinates of p. According to part (b) of the above figure, this gives

\displaystyle{\cos (\beta i) = \cosh (\beta); \qquad \sin (\beta i) = i \sinh (\beta)}

\displaystyle{\cos (\pi + \beta i) = -\cosh (\beta); \qquad \sin (\pi + \beta i) = -i \sinh (\beta)}

Notice that both these relations hold for both positive and negative values of \beta, and are in agreement with the expansions of  \cos (\alpha+i\beta)  and \sin (\alpha+i\beta)  stated earlier.
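To see why the arclength along the hyperbola comes out imaginary, here is a short sketch of the computation. It assumes that arclength is obtained by integrating the complex "speed" of a parametrization, with the square root taken in H:

```latex
\begin{aligned}
\gamma(t) &= (\cosh t,\; i\sinh t), \qquad \gamma(0) = (1,0) = o, \\
\gamma'(t) &= (\sinh t,\; i\cosh t), \\
\sqrt{(\sinh t)^2 + (i\cosh t)^2} &= \sqrt{\sinh^2 t - \cosh^2 t} = \sqrt{-1} = i \in H, \\
\int_0^{\beta} i \, dt &= \beta i .
\end{aligned}
```

So the point p = (\cosh \beta, i \sinh \beta) sits at oriented arclength \beta i from o, which matches the values of \cos(\beta i) and \sin(\beta i) given above.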

But, to “see” what a complex angle looks like we will have to examine the complex versions of lines and rays. Despite the four dimensional flavour, \mathbb{C}^2 is a two-dimensional vector space over the field \mathbb{C}, just like \mathbb{R}^2 over \mathbb{R}.

Since a line (through the origin) in \mathbb{R}^2 is the span of a nonzero vector, we define a complex line in \mathbb{C}^2 analogously. For a nonzero vector u in \mathbb{C}^2, the complex line \Lambda through u is span(u), which is isomorphic to the complex plane.

In \mathbb{R}^2, the ray \overline{\mathbf{u}} passing through a nonzero vector u can be defined as the set of all nonnegative real multiples of u. Extending this to \mathbb{C}^2 seems problematic, for the word “nonnegative” has no meaning in \mathbb{C}. Using the half-plane H (where the complex square root is defined) seems a reasonable alternative. If u is a nonzero vector in \mathbb{C}^2, then the complex ray through u is the set \overline{\mathbf{u}} = \{\lambda u \ : \  \lambda\in H\}.

Finally, we define a complex angle as the union of two complex rays \overline{\mathbf{u}_1} and \overline{\mathbf{u}_2}.

I will end my post by quoting an application of imaginary angles in optics from here:

… in optics, when a light ray hits a surface such as glass, Snell’s law tells you the angle of the refracted beam, Fresnel’s equations tell you the amplitudes of reflected and transmitted waves at an interface in terms of that angle. If the incidence angle is very oblique when travelling from glass into air, there will be no refracted beam: the phenomenon is called total internal reflection. However, if you try to solve for the angle using Snell’s law, you will get an imaginary angle. Plugging this into the Fresnel equations gives you the 100% reflectance observed in practice, along with an exponentially decaying “beam” that travels a slight distance into the air. This is called the evanescent wave and is important for various applications in optics. [Mathematics.SE]

Colourful complex functions


Recently I became curious about functions defined from \mathbb{C} to \mathbb{C} and I asked myself the following question:

What would complex functions look like if we tried to plot them?

Graphs of complex functions lie in \mathbb{C}^2, which can be identified in a natural way with \mathbb{R}^4, real four-dimensional space.

So I jumped to SageMath and plotted z^2:

sage: f(z) = z^2
sage: complex_plot(f, (-5, 5), (-5, 5))


graph of z^2 plotted using SageMath Version 7.0


Now this looked like an enigma to me. What do the colours stand for? As usual, there is an interesting entry about this on Wikipedia, Colour wheel graphs of complex functions.

I dug further and discovered that these are called “2D colour maps” and are one of many ways of visualizing complex functions, like 3D models, 2D vector plots, 4D perspective projection, conformal maps…


HLS Cylinder (By SharkDderivative [CC BY-SA 3.0 or GFDL], via Wikimedia Commons)

But what is the colour map? The colour map uses the HLS colour system (“hue-lightness-saturation”). HLS is a cylindrical-coordinate representation of points in an RGB colour model. In the cylinder, the angle around the central vertical axis corresponds to “hue” (i.e. shade of a colour), the distance from the axis corresponds to “saturation”, and the distance along the axis corresponds to “lightness”.


The argument φ and modulus r locate a point on an Argand diagram i.e. complex plane.(By Kan8eDie [CC BY-SA 3.0, CC BY-SA 3.0 or GFDL], via Wikimedia Commons)

The hue represents the argument (also called phase angle) of the complex number z. The absolute value (also called magnitude or modulus) is given by the lightness of the colour. All colours of the colour map have the maximal saturation (with respect to the given lightness).


HLS Colour Wheel (source: iliasky.com)

Positive real numbers always appear red. The primary colours appear at phase angles  \frac{2 \pi}{3} (green) and \frac{4\pi}{3} (blue). The subtractive colours yellow, cyan, and magenta have the phases \frac{\pi}{3}, \pi, and \frac{5\pi}{3}.

The poles of a complex function are white, the zeros are black.
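The colouring rule described above can be sketched with Python's standard `colorsys` module. The hue follows the argument as stated. The particular lightness formula, (2/\pi)\arctan|z|, is only my assumption, chosen so that 0 maps to black and very large values tend to white. It is not necessarily the scaling SageMath or Thaller uses:

```python
import cmath
import colorsys
import math


def colour_of(z):
    """Map a complex number to an (r, g, b) triple with components in [0, 1]."""
    hue = (cmath.phase(z) / (2 * math.pi)) % 1.0   # argument -> hue
    lightness = (2 / math.pi) * math.atan(abs(z))  # modulus -> lightness (assumed scaling)
    return colorsys.hls_to_rgb(hue, lightness, 1.0)  # always full saturation


for label, z in (("1", 1),
                 ("exp(2*pi*i/3)", cmath.exp(2j * math.pi / 3)),
                 ("exp(4*pi*i/3)", cmath.exp(4j * math.pi / 3)),
                 ("0", 0)):
    print(label, tuple(round(c, 3) for c in colour_of(z)))
```

With this scaling, numbers on the unit circle get lightness 1/2, so 1 is pure red and the cube roots of unity at phases \frac{2\pi}{3} and \frac{4\pi}{3} are pure green and pure blue.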

Finally, to conclude [from: Visual Quantum Mechanics: Selected Topics with Computer-Generated Animations of Quantum-Mechanical Phenomena by Bernd Thaller]:

This colour map is obtained by a stereographic projection from the surface of the three-dimensional colour space (in the hue-lightness-saturation system) onto the complex plane.

An appropriately coloured surface graphic or density graphic can give a useful graphical representation of a complex-valued function. Another example of a complex-valued function, a wave function, is given here:


Visualizations of a wave function in two dimensions. The left graphic shows the function as a “density plot” with additional contour lines for the absolute value. In the three-dimensional surface plot the height of the surface gives the absolute value of the wave function. (By Bernd Thaller, created using Mathematica. © 2000 Springer-Verlag New York, Inc.)

For more such graphs, visit Bernd Thaller’s Gallery of complex functions .