Category Archives: Problem Solving

Arithmetic Puzzle


Following is a very common arithmetic puzzle that you may have encountered as a child:

Express any whole number n using the number 2 precisely four times and using only well-known mathematical symbols.

This puzzle has been discussed on pp. 172 of Graham Farmelo’s “The Strangest Man“, and how Paul Dirac solved it by using his knowledge of “well-known mathematical symbols”:

\displaystyle{n = -\log_{2}\left(\log_{2}\left(2^{2^{-n}}\right)\right) = -\log_{2}\left(\log_{2}\left(\underbrace{\sqrt{\sqrt{\ldots\sqrt{2}}}}_\text{n times}\right)\right)}

This is an example of thinking out of the box, enabling you to write any number using only three/four 2s. Though, using a transcendental function to solve an elementary problem may appear like an overkill.  But, building upon such ideas we can try to tackle the general problem, like the “four fours puzzle“.

This post on Puzzling.SE describes usage of following formula consisting of  trigonometric operation \cos(\arctan(x)) = \frac{1}{\sqrt{1+x^2}} and \tan(\arcsin(x))=\frac{x}{\sqrt{1-x^2}} to obtain the square root of any rational number from 0:


Using this we can write n using two 2s:

\displaystyle{n = (\underbrace{\tan\arcsin\cos\arctan\cos\arctan}_{n-4\text{ times}}\,2)^2}

or even with only one 2:

\displaystyle{n = \underbrace{\tan\arcsin\cos\arctan\cos\arctan}_{n^2-4\text{ times}}\,2}

Understanding Geometry – 2


If you want to brush up your high school geometry knowledge, then KhanAcademy is a good place to start. For example, I learned a new proof of Pythagoras Theorem (there are 4 different proofs on KhanAcademy) which uses scissors-congruence:

In this post, I will share with you few theorems from L. I. Golovina and I. M. Yaglom’s “Induction in Geometry ”  which I learned while trying to prove Midpoint-Polygon Conjecture.

Theorem 1: The sum of interior angles of an n-gon is 2\pi (n-2).

Theorem 2: The number of ways in which a convex n-gon can be divided into triangles by non-intersecting diagonals is given by


Theorem 3: Given a \triangle ABC, with n-1 straight lines CM_1, CM_2, \ldots CM_{n-1} drawn through its vertex C, cutting the triangle into n smaller triangles \triangle ACM_1, \triangle M_1CM_2, \ldots, \triangle M_{n-1}CB. Denote by r_1, r_2, \ldots r_n and \rho_1, \rho_2, \ldots, \rho_n respectively the radii of the inscribed and circumscribed circles of these triangles (all the circumscribed circles are inscribed within the angle C of the triangle) and let r and \rho be the radii of the inscribed and circumscribed circles (respectively) of the \triangle ABC itself. Then

\displaystyle{\frac{r_1}{\rho_1} \cdot\frac{r_2}{\rho_2} \cdots\frac{r_n}{\rho_n} =\frac{r}{\rho} }

Theorem 4: Any convex n-gon which is not a parallelogram can be enclosed by a triangle whose sides lie along three sides of the given n-gon.

Theorem 5 (Levi’s Theorem): Any convex polygon which is not a parallelogram can be covered with three homothetic polygons smaller than the given one.

The above theorem gives a good idea of what “combinatorial geometry” is all about. In this subject, the method of mathematical induction is widely used for proving various theorems. Combinatorial geometry deals with problems, connected with finite configurations of points or figures. In these problems, values are estimated connected with configurations of figures (or points) which are optimal in some sense.

Theorem 6 (Newton’s Theorem): The midpoints of the diagonals of a quadrilateral circumscribed about a circle lie on one straight line passing through the centre of the circle.

Theorem 7 (Simson’s Theorem): Given a \triangle ABC inscribed in the circle S with an arbitrary point P on this circle. Then then feet of the perpendiculars dropped from the point P to the sides of the \triangle ABC are collinear.

We can extend the above idea of Simson’s line to any n-gon inscribed in a circle.

Theorem 8: A 3-dimensional space is divided into \frac{(n+1)(n^2-n+6)}{6} parts by n planes, each three of which intersect and no four of which have a common point.

Theorem 9: Given n spheres in 3-dimesnional space, each four of which intersect. Then all these spheres intersect, i. e. there exists a point belonging to all the spheres.

Theorem 10 (Young’s Theorem): Given n points in the plane such that each pair of them are at a distance of at most 1 from each other. Then all these points can be enclosed in a circle of radius 1/\sqrt{3}.

I won’t be discussing their proofs since the booklet containing the proofs and the detailed discussion is freely available at Mir Books.

Also, I would like to make a passing remark about the existence of a different kind of geometry system, called “finite geometry“. A finite geometry is any geometric system that has only a finite number of points. The familiar Euclidean geometry is not finite because a Euclidean line contains infinitely many points. A geometry based on the graphics displayed on a computer screen, where the pixels are considered to be the points, would be a finite geometry. While there are many systems that could be called finite geometries, attention is mostly paid to the finite projective and affine spaces because of their regularity and simplicity. You can learn more about it here:

Huntington’s Red-Blue Set


While reading Lillian Lieber’s book on infinity, I came across an astonishing example of infinite set (on pp. 207). Let’s call the property of existence of a rational number between given two rational number to be “beauty” (a random word introduced by me to make arguments clearer).

The set of rational numbers between 0 and 1 are arranged in ascending order of magnitude, and all of them are coloured blue. This is clearly a beautiful set. Then another another set of rational numbers between 0 and 1 is taken and arrange in ascending order of magnitude, but all of them are coloured red. This is also a beautiful set. Now, put these two sets together in such a way that each blue number is immediately followed by the corresponding red number. For example, 1/2 is immediately followed by 1/2 etc.  It appears that if we interlace two beautiful sets, the resulting set should be even more beautiful. But since each blue number has an immediate successor, namely the corresponding red number, so that between these two we can’t find even a single other rational number, red or blue, the resulting set is NOT beautiful.

The set created above is called Huntington’s Red-Blue set. It is an ingenious invention, where two beautiful sets combined together lead to loss of beauty. For more details, read the original paper:

Huntington, Edward V. “The Continuum as A Type of Order: An Exposition of the Modern Theory.” Annals of Mathematics, Second Series, 7, no. 1 (1905): 15-43. doi:10.2307/1967192.

Imaginary Angles


You would have heard about imaginary numbers and most famous of them is i=\sqrt{-1}. I personally don’t like this name because all of mathematics is man/woman made, hence all mathematical objects are imaginary (there is no perfect circle in nature…) and lack physical meaning. Moreover, these numbers are very useful in physics (a.k.a. the study of nature using mathematics). For example, “time-dependent Schrödinger equation

\displaystyle{i \hbar \frac{\partial}{\partial t}\Psi(\mathbf{r},t) = \hat H \Psi(\mathbf{r},t)}

But, as described here:

Complex numbers are a tool for describing a theory, not a property of the theory itself. Which is to say that they can not be the fundamental difference between classical and quantum mechanics (QM). The real origin of the difference is the non-commutative nature of measurement in QM. Now this is a property that can be captured by all kinds of beasts — even real-valued matrices. [Physics.SE]

For more of such interpretation see: Volume 1, Chapter 22 of “The Feynman Lectures in Physics”. And also this discussion about Hawking’s wave function.

All these facts may not have fascinated you, but the following fact from Einstein’s Special Relativity should fascinate you:

In 1908 Hermann Minkowski explained how the Lorentz transformation could be seen as simply a hyperbolic rotation of the spacetime coordinates, i.e., a rotation through an imaginary angle. [Wiki: Rapidity]

Irrespective of the fact that you do/don’t understand Einstein’s relativity, the concept of imaginary angle appears bizarre. But, mathematically its just another consequence of non-euclidean geometry which can be interpreted as Hyperbolic law of cosines etc. For example:

\displaystyle{\cos (\alpha+i\beta) = \cos (\alpha) \cosh (\beta) - i \sin (\alpha) \sinh (\beta)}

\displaystyle{\sin (\alpha+i\beta) = \sin (\alpha) \cosh (\beta) + i \cos (\alpha) \sinh (\beta)}

Let’s try to understand what is meant by “imaginary angle” by following the article “A geometric view of complex trigonometric functions” by Richard Hammack. Consider the complex unit circle  U=\{z,w\in \mathbb{C} \ :  \  z^2+w^2=1\} of \mathbb{C}^2, in a manner exactly analogous to the definition of the standard unit circle in \mathbb{R}^2. Apparently U is some sort of surface in \mathbb{C}^2, but it can’t be drawn as simply as the usual unit circle, owing to the four-dimensional character of \mathbb{C}^2. But we can examine its lower dimensional cross sections. For example, if  z=x+iy and w=u+iv then by setting y = 0 we get the circle x^2+u^2=1 in x-u plane for v=0 and the hyperbola x^2-v^2 = 1 in x-vi plane for u=0.


The cross-section of complex unit circle (defined by z^2+w^2=1 for complex numbers z and w) with the x-u-vi coordinate space (where z=x+iy and w=u+iv) © 2007 Mathematical Association of America

These two curves (circle and hyperbola) touch at the points ±o, where o=(1,0) in \mathbb{C}^2, as illustrated above. The symbol o is used by Richard Hammack because this point will turn out to be the origin of complex radian measure.

Let’s define complex distance between points \mathbf{a} =(z_1,w_1) and \mathbf{b}=(z_2,w_2) in \mathbb{C}^2 as


where square root is the half-plane H of \mathbb{C} consisting of the non-negative imaginary axis and the numbers with a positive real part. Therefore, the complex distance between two points in \mathbb{C}^2 is a complex number (with non-negative real part).

Starting at the point o in the figure above, one can move either along the circle or along the right-hand branch of the hyperbola. On investigating these two choices, we conclude that they involve traversing either a real or an imaginary distance. Generalizing the idea of real radian measure, we define imaginary radian measure to be the oriented arclength from o to a point p on the hyperbola, as


(a) Real radian measure (b) Imaginary radian measure. © 2007 Mathematical Association of America

If p is above the x axis, its radian measure is \beta i with \beta >0, while if it is below the x axis, its radian measure is \beta i with \beta <0. As in the real case, we define \cos (\beta i) and \sin (\beta i) to be the z and w coordinates of p. According to above figure (b), this gives

\displaystyle{\cos (\beta i) = \cosh (\beta); \qquad \sin (\beta i) = i \sinh (\beta)}

\displaystyle{\cos (\pi + \beta i) = -\cosh (\beta); \qquad \sin (\pi + \beta i) = -i \sinh (\beta)}

Notice that both these relations hold for both positive and negative values of \beta, and are in agreement with the expansions of  \cos (\alpha+i\beta)  and \sin (\alpha+i\beta)  stated earlier.

But, to “see” what a complex angle looks like we will have to examine the complex versions of lines and rays. Despite the four dimensional flavour, \mathbb{C}^2 is a two-dimensional vector space over the field \mathbb{C}, just like \mathbb{R}^2 over \mathbb{R}.

Since a line (through the origin) in \mathbb{R}^2 is the span of a nonzero vector, we define a complex line in \mathbb{C}^2 analogously. For a nonzero vector u in \mathbb{C}^2, the complex line \Lambda through u is span(u), which is isomorphic to the complex plane.

In \mathbb{R}^2, the ray \overline{\mathbf{u}} passing through a nonzero vector u can be defined as the set of all nonnegative real multiples of u. Extending this to \mathbb{C}^2 seems problematic, for the word “nonnegative” has no meaning in \mathbb{C}. Using the half-plane H (where complex square root is defined) seems a reasonable alternative. If u is a nonzero vector in \mathbb{C}, then the complex ray through u is the set \overline{\mathbf{u}} = \{\lambda u \ : \  \lambda\in H\}.

Finally, we define a complex angle is the union of two complex rays \overline{\mathbf{u}_1} and \overline{\mathbf{u}_2} .

I will end my post by quoting an application of imaginary angles in optics from here:

… in optics, when a light ray hits a surface such as glass, Snell’s law tells you the angle of the refracted beam, Fresnel’s equations tell you the amplitudes of reflected and transmitted waves at an interface in terms of that angle. If the incidence angle is very oblique when travelling from glass into air, there will be no refracted beam: the phenomenon is called total internal reflection. However, if you try to solve for the angle using Snell’s law, you will get an imaginary angle. Plugging this into the Fresnel equations gives you the 100% reflectance observed in practice, along with an exponentially decaying “beam” that travels a slight distance into the air. This is called the evanescent wave and is important for various applications in optics. [Mathematics.SE]

Different representations of a number as sum of squares


A couple of weeks ago, I wrote a post on Ramanujam’s 129th birthday. In that post I couldn’t verify the fact that:

129 is the smallest number that can be written as the sum of 3 squares in 4 ways.

So I contacted Sannidhya, the only good programmer I know . He wrote a program in Python which finds all numbers less than 1000 that can be written as sum of three squares. Here is the program :

Now, we can conclude that 129 is the smallest such number and the next is 134.

Let’s try to understand, how this program works. Firstly, the sumOfSquares(n) procedure finds all a, b such that n = a^2 + b^2. Then the sumOfSquares3(n) procedure finds all a,b,c such that n = a^2 + b^2 + c^2. It works by repeatedly invoking sumOfSquares on (n-i^2) where i is incremented after each iteration from 1 to square root of n/3. Finally, we run a loop up to 1000 and find those number for which sumOfSquares3 returns 4 triplets (a,b,c). Similarly, one can also find numbers which can be expressed as sum of 3 squares in 5 different ways. The smallest one is 194, with the 5 triplets being (1, 7, 12), (3, 4, 13), (3, 8, 11), (5, 5, 12) and (7, 8, 9).
But how does the functions sumOfSquares(n) and sumOfSquares3(n) work? This is how Sannidhya explains:

 The SumOfSquares(n) finds all the unordered pairs (a, b), such that a^2 + b^2 = n. It works by subtracting a perfect square (i^2) from n, and checking if the remaining part is a perfect square as well. If so, then (i, sqrt(n-i^2)) will form a required unordered pair. Here, i loops from 1 to square root of n/2. Note: One can also loop from 1 to square root of n but after square root of n/2, further iterations will generate redundant pairs which are just permutations of the pairs already obtained. For example, consider 25, the expected output should be (3,4) only, but if the loop runs from 1 to square root of n, then the output will be (3, 4), (4, 3). As you can see we are getting redundant pairs. So, we run the loop from 1 to square root of n/2.

The SumOfSquares3 function calls SumOfSquares repeatedly with the argument n – i^2, where i is incremented from 1 to square root of n/3. Note that each element of SumOfSquares(n – i^2) is a pair. For each of these elements, the loop forms a triplet consisting of i and the pair. This triplet is then appended to the list, which is finally returned.

The repetitions of triplets can easily be controlled by using sorted function from Python in sumOfSquares3(n).

Indeed, these type of question are a bit hard computationally. For example, see:

Related discussions on MathOverflow:

– Is there a simple way to compute the number of ways to write a positive integer as the sum of three squares? : Note that this is not answer of my question since r_k(n) counts the number of representations of n by k squares, allowing zeros and distinguishing signs and order.

– Efficient computation of integer representation as sum of three squares

Related discussions on ComputerScience.SE

– Listing integers as the sum of three squares m=x^2+y^2+z^2 : Sannidhya did a clever improvement to this algorithm, but still as pointed here, Sannidhya’s algorithm is of O(n).

Related discussions on Mathematics.SE

– When is a rational number a sum of three squares?

– Why can’t this number be written as a sum of three squares of rationals?

– Sum of one, two, and three squares

Not all numbers are computable


When we hear the word number, symbols like 1,\sqrt{2},¼, π (area enclosed by a unit circle), ι (symbol for \sqrt{-1}), ε (infinitesimal),  ω (ordinal infinity), ℵ (cardinal infinity), …. appear in our mind.  But not all numbers are  computable:

A computable number [is] one for which there is a Turing machine which, given n on its initial tape, terminates with the nth digit of that number [encoded on its tape].

In other words, a real number is called computable if there is an algorithm which, given n, returns the first n digits of the number. This is equivalent to the existence of a program that enumerates the digits of the real number. For example, π is a computable number (why? see here).

Using Cantor’s diagonal argument on a list of all computable numbers, we get a non-computable number (here is the discussion). For example, a sum of series of real numbers called Chaitin’s constant, denoted by Ω, is a non-computable number (why? see here).

Fun fact: We don’t know whether π is a normal number or not (though we want it to be a normal number), but Ω is known to be a normal number (just like the Mahler’s Number discussed here).

Packing Problems


Easy to state and hard to solve problems make mathematics interesting. Packing problems are one such type. In fact there is a very nice Wikipedia article on this topic:

 Packing problems involve attempting to pack objects together into containers. The goal is to either pack a single container as densely as possible or pack all objects using as few containers as possible.

I came across this problem a couple of years ago, while watching the following TED Talk by Eduardo Sáenz de Cabezón:

In this talk, the object of attraction is the Weaire-Phelan structure made from six 14-hedrons and two dodechedrons. This object is believed to be the solution of Kelvin’s problem:

How you can chop 3d space into cells of equal volume with the minimum surface area per cell?

The packing problem for higher dimensions was in news last year, since a problem about “Densest Packing Problem in Dimensions 8 and 24” was solved by a young mathematician (Maryna Viazovska). The mathematics involved in the solution is very advanced but we can start gaining knowledge from this book:


A classic reference in this field by two well known geniuses.

Recently, while reading Matt Parker’s book, I discovered a wonderful website called Packomania by Eckard Specht (Otto-von-Guericke-Universität Magdeburg) containing data about packing problems in 2D and 3D. (also checkout his website, it’s a good reference for elementary triangle geometry and inequalities problems.)


Screen-shot of Dr. Eckard Specht’s homepage, his online problem collection (with solutions; each problem in GIF, PS and PDF formats) and Packomania.

Computers also have a role to play in solving such optimization problems For example, in 2015 Thomas Hales formally proved Kepler’s Conjecture about 3D packing:

No arrangement of equally sized spheres filling space has a greater average density than that of the cubic close packing (face-centered cubic) and hexagonal close packing arrangements.

using HOL Light proof assistant.

The proof was a huge collaboration and was called Flyspeck Project. The details about this proof are available in this book:


NOTE: You can construct your own Truncated octahedron and Weaire-Phelan polyhedra by printing the nets available at Matt Parker’s website.