Category Archives: Problem Solving

Discrete Derivative


I came across the following interesting question in the book “Math Girls” by Hiroshi Yuki :

Develop a definition for the difference operator \Delta in discrete space, corresponding to the definition of the differential operator D in continuous space.

We know that the derivative of a function f at a point x is the rate at which f changes at x. Geometrically, the derivative at x is the slope of the tangent to the graph of f at x, where the tangent is the limit of the secant lines as shown below:


But this works only in the continuous world, where x “glides smoothly” from one point to another. This is not the case in a discrete world, where there is nothing like “being close to each other”. Hence we cannot use the earlier definition, which brings x+h arbitrarily close to x by letting h tend to 0. In a discrete world we cannot talk about getting “close” to something, but we can talk about being “next” to each other.


We can talk about the change in x as it moves from x to x+1 while f changes from f(x) to f(x+1). We do not need limits here, so the definition of “difference operator” (analogous to differential operator ) will be :

\Delta f(x) = \frac{ f(x+1) - f(x) }{(x+1) - x} = f(x+1) - f(x)

For example, for the function g(x) = x^2 it is easy to verify, using the definitions above, that Dg(x) = 2x but \Delta g(x) = (x+1)^2 - x^2 = 2x + 1.
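The definition above is easy to try out on a computer. Here is a minimal Python sketch (the helper name `delta` is my own choice, not from the book) which builds \Delta f from f and compares \Delta g with 2x + 1 for g(x) = x^2:

```python
def delta(f):
    """Return the forward difference of f, i.e. the function x -> f(x + 1) - f(x)."""
    return lambda x: f(x + 1) - f(x)


def g(x):
    return x ** 2


delta_g = delta(g)

# Tabulate (x, Delta g(x), 2x + 1) on a few integers; the last two columns agree.
table = [(x, delta_g(x), 2 * x + 1) for x in range(-3, 4)]
for row in table:
    print(row)
```

Note that no limit is taken anywhere: the step is fixed at 1.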

Now, when will we be able to get the same derivative in both the discrete and the continuous worlds? I read a little about this question in “Math Girls” and a little more in “An introduction to the calculus of finite differences” by C.H. Richardson.

Calculus of differences is the study of the relations that exist between the values assumed by the function whenever the independent variable takes on a series of values in arithmetic progression.

From now onwards let us write f_x instead of f(x), so that \Delta f_x = f_{x+1} - f_x. Using this definition we can prove the following for functions U_x and V_x and a constant c:

1) \Delta^{k+1} U_x = \Delta^{k} U_{x+1} - \Delta^{k} U_x

2) \Delta (U_x + V_x) = \Delta U_x + \Delta V_x (or) \Delta^k (U_x + V_x) =\Delta^k U_x + \Delta^k V_x

3) \Delta^k (cU_x) = c \Delta^k U_x

Theorem. \Delta^n x^n = n!

Proof. \Delta x^n = (x+1)^n - x^n = n\cdot x^{n-1} + \text{terms of degree lower than } (n - 1). Each repetition of the process of differencing reduces the degree by one and also adds one factor to the succession n(n - 1) (n - 2) \cdots. Repeating the process n times we have \Delta^n x^n = n!.

Corollary 1. \Delta^n ax^n = a\cdot n!

Corollary 2. \Delta^{n+1} x^n = 0

Corollary 3. If U_x is a polynomial of degree n, i.e. U_x= a_0+ a_1 x + a_2 x^2 + \ldots + a_n x^n, then \Delta^n U_x = a_n\cdot n!.
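The theorem and its corollaries can be checked numerically. The following is only a sanity check, not a proof; the helper names `delta` and `delta_k` and the sample polynomial are my own choices:

```python
from math import factorial


def delta(f):
    """Forward difference: x -> f(x + 1) - f(x)."""
    return lambda x: f(x + 1) - f(x)


def delta_k(f, k):
    """Apply the difference operator k times."""
    for _ in range(k):
        f = delta(f)
    return f


n = 5


def monomial(x):
    return x ** n


def cubic(x):
    # Sample polynomial of degree 3 with leading coefficient 7.
    return 4 - x + 3 * x ** 2 + 7 * x ** 3


print(delta_k(monomial, n)(10), factorial(n))  # Theorem: both equal n!
print(delta_k(monomial, n + 1)(10))            # Corollary 2: zero
print(delta_k(cubic, 3)(10), 7 * factorial(3))  # Corollary 3: a_n * n!
```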

We call the continued products U_x^{|n|} = U_x\cdot U_{x+1}\cdot U_{x+2} \cdots U_{x+(n-1)} and U_x^{(n)} = U_x \cdot U_{x-1}\cdot U_{x-2}\cdots U_{x-(n-1)} factorial expressions.

If U_x is the function ax+b for some real numbers a and b, then the factorial forms we get by replacing U_x by ax+b are (ax+b)^{|n|} = (ax+b)\cdot(a(x+1)+b)\cdot (a(x+2)+b)\cdots (a(x+n-1)+b) and (ax+b)^{(n)} =(ax+b)\cdot (a(x-1)+b)\cdot (a(x-2)+b)\cdots (a(x-(n-1))+b).

We define (ax+b)^{|0|} and (ax+b)^{(0)} as 1.

Using the above definition of factorial we can show the following :

(i) \Delta (ax+b)^{(n)} = a\cdot n \cdot (ax+b)^{(n-1)}

(ii) \displaystyle{\Delta \frac{1}{(ax+b)^{|n|}} = \frac{-an}{(ax+b)^{|n+1|}}}

When we consider the special case of a=1 and b=0, the factorial expressions are called rising and falling factorials:

x^{|n|} = x \cdot (x+1)\cdot (x+2)\cdots (x+n-1) – rising factorial

x^{(n)} =x\cdot (x-1) \cdot (x-2) \cdots (x-n+1) – falling factorial.

Substituting a=1 and b=0 in (i) and (ii) above, we get

\Delta x^{(n)} = n\cdot x^{(n-1)} , \Delta^n x^{(n)} = n! and \displaystyle{\Delta \frac{1}{x^{|n|}} = - \frac{n}{x^{|n+1|}}}.
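These identities are what make x^{(n)} behave like x^n does under D. A small Python check, using exact rational arithmetic for the reciprocal identity (the function names `falling` and `rising` are my own):

```python
from fractions import Fraction


def falling(x, n):
    """Falling factorial x^(n) = x (x-1) ... (x-n+1), with x^(0) = 1."""
    result = 1
    for i in range(n):
        result *= x - i
    return result


def rising(x, n):
    """Rising factorial x^|n| = x (x+1) ... (x+n-1), with x^|0| = 1."""
    result = 1
    for i in range(n):
        result *= x + i
    return result


def check_falling(x, n):
    # Delta x^(n) = n * x^(n-1)
    return falling(x + 1, n) - falling(x, n) == n * falling(x, n - 1)


def check_reciprocal_rising(x, n):
    # Delta (1 / x^|n|) = -n / x^|n+1|, for positive integers x
    lhs = Fraction(1, rising(x + 1, n)) - Fraction(1, rising(x, n))
    return lhs == Fraction(-n, rising(x, n + 1))


print(all(check_falling(x, n) for x in range(-5, 6) for n in range(1, 6)))
print(all(check_reciprocal_rising(x, n) for x in range(1, 8) for n in range(1, 6)))
```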



source: Richardson, C. H. An introduction to the calculus of finite differences. pp. 10.

Due to the fact that x^{(n)} plays in the calculus of finite differences a role similar to that played by x^n in the infinitesimal calculus, for many purposes in finite differences it is advisable to express a given polynomial in a series of factorials. A method of accomplishing this is contained in Newton’s Theorem.


source: Richardson, C. H. An introduction to the calculus of finite differences. pp. 10.

Since these differences and U_x are identities, they are true for all values of x, and consequently must hold for x = 0. Setting x = 0 in the given function and the differences, we have the required values for all a_i and theorem is proved.
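The statement of the theorem was an image in the original post, so the following sketch rests on an assumption: that it is the usual Newton forward-difference expansion, U_x = U_0 + \Delta U_0 \cdot x^{(1)} + \frac{\Delta^2 U_0}{2!} x^{(2)} + \ldots + \frac{\Delta^n U_0}{n!} x^{(n)}. Under that assumption, the coefficients can be read off the top of a difference table, which is exactly the “set x = 0” step of the proof (the sample polynomial and all names are mine):

```python
from fractions import Fraction
from math import factorial


def falling(x, n):
    """Falling factorial x^(n) = x (x-1) ... (x-n+1)."""
    result = 1
    for i in range(n):
        result *= x - i
    return result


def newton_coefficients(u, degree):
    """Return [Delta^k u(0) / k!  for k = 0..degree] from a difference table."""
    row = [u(x) for x in range(degree + 1)]
    coefficients = []
    for k in range(degree + 1):
        coefficients.append(Fraction(row[0], factorial(k)))
        # The next row of the table holds the differences of the current row.
        row = [row[i + 1] - row[i] for i in range(len(row) - 1)]
    return coefficients


def u(x):
    return 2 * x ** 3 - x + 4


coefficients = newton_coefficients(u, 3)


def rebuilt(x):
    """Evaluate the factorial series with the coefficients found above."""
    return sum(c * falling(x, k) for k, c in enumerate(coefficients))


print(coefficients)
print(all(rebuilt(x) == u(x) for x in range(-5, 6)))
```

For this sample the series reads 2x^3 - x + 4 = 4 + x^{(1)} + 6x^{(2)} + 2x^{(3)}.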


Enclosing closed curves in squares


Let’s look at the following innocent looking question:

Is it possible to circumscribe a square about every closed curve?

The answer is YES! I found an unexpected and interesting proof in the book “Intuitive Combinatorial Topology ” by V.G. Boltyanskii and V.A. Efremovich . Let’s now look at the outline of proof for our claim:

1. Let any closed curve K be given. Draw any line l and a line l’ parallel to l, with K lying between them, as shown in fig 1.


2. Move the lines l and l’ closer to K till they just touch the curve K as shown in fig 2. Let the new lines be line m and line m’. Call these the support lines of the curve K with respect to line l.


3. Draw a line l* perpendicular to l and the line (l*)’ parallel to l* . Draw support lines with respect to line l* to the curve K as shown in the fig 3. Let the rectangle formed be ABCD .


4. The rectangle corresponding to a line becomes a square when AB and AD are equal. Let the length of the side parallel to l (which is AB) be h_1(\mathbf{l}) and the length of the side perpendicular to l (which is AD) be h_2(\mathbf{l}). For a given line n, define a real-valued function f(\mathbf{n}) = h_1(\mathbf{n})-h_2(\mathbf{n}) on the set of lines lying outside the curve. Now rotate the line l in an anti-clockwise direction through a right angle, till l coincides with l*. The rectangle corresponding to l* is also ABCD (the same as that with respect to l), but the roles of the sides are swapped: AB = h_2(\mathbf{l^*}) and AD = h_1(\mathbf{l^*}).


5. When the line is l, we have f(\mathbf{l}) = h_1(\mathbf{l})-h_2(\mathbf{l}). When we rotate l in an anti-clockwise direction the value of the function f changes continuously, i.e. f is a continuous function (I do not know how to “prove” that this is a continuous function but it’s intuitively clear to me; if you have a proof please mention it in the comments). When l coincides with l* the value is f(\mathbf{l^*}) = h_1(\mathbf{l^*})-h_2(\mathbf{l^*}). Since h_1(\mathbf{l^*}) = h_2(\mathbf{l}) and h_2(\mathbf{l^*}) = h_1(\mathbf{l}), we get f(\mathbf{l^*}) = -(h_1(\mathbf{l}) - h_2(\mathbf{l})) = -f(\mathbf{l}). So f is a continuous function which changes sign when the line is moved from l to l* (unless f(\mathbf{l}) = 0, in which case ABCD is already a square). Since f is continuous, using the generalization of the intermediate value theorem we can show that there exists a line p between l and l* such that f(p) = 0, i.e. AB = AD. So the rectangle corresponding to line p will be a square.

Hence every curve K can be circumscribed by a square.
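The proof is constructive enough to try numerically. Below is a rough Python sketch, under my own assumptions: the curve is replaced by a finite sample of its points, a direction is described by an angle theta, and the distance between the two support lines for a direction is computed from projections. Bisection then plays the role of the intermediate value theorem on the quarter turn from l to l*:

```python
import math


def width(points, theta):
    """Distance between the two support lines perpendicular to direction theta."""
    c, s = math.cos(theta), math.sin(theta)
    projections = [x * c + y * s for x, y in points]
    return max(projections) - min(projections)


def side_difference(points, theta):
    """Difference of the two side lengths of the bounding rectangle at angle theta."""
    return width(points, theta) - width(points, theta + math.pi / 2)


def square_angle(points, tol=1e-12):
    """Find an angle in [0, pi/2] at which the bounding rectangle is a square."""
    lo, hi = 0.0, math.pi / 2
    f_lo = side_difference(points, lo)
    if f_lo == 0:
        return lo
    # side_difference at pi/2 is minus its value at 0, so the sign changes.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = side_difference(points, mid)
        if f_mid == 0:
            return mid
        if (f_mid > 0) == (f_lo > 0):
            lo = mid  # same sign as the left end: the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


# A triangle stands in for the curve K; its vertices determine all support lines.
triangle = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
theta = square_angle(triangle)
print(theta, width(triangle, theta), width(triangle, theta + math.pi / 2))
```

At the angle found, the two printed widths agree up to rounding, so the bounding rectangle in that orientation is (numerically) a square.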

Rooms and reflections


Consider the following entry from my notebook (16-Feb-2014):

The Art Gallery Problem: An art gallery has the shape of a simple n-gon. Find the minimum number of watchmen needed to survey the building, no matter how complicated its shape. [Source: problem 25, chapter 2, Problem Solving Strategies, Arthur Engel]

Hint: Use triangulation and colouring. Not an easy problem, and in fact there is a book dedicated to the theme of this problem: Art Gallery Theorems and Algorithms by Joseph O’Rourke (see chapter one for detailed solution). No reflection involved.

Then we have a somewhat harder problem when we allow reflection (28-Feb-2017, Numberphile – Prof. Howard Masur):

The Illumination Problem: Can any room (not necessarily a polygon) with mirrored walls always be illuminated by a single point light source, allowing for the repeated reflection of light off the mirrored walls?

The answer is NO. The next obvious question is “What kind of dark regions are possible?”. This question has been answered for rational polygons.

This reminds me of the much simpler theorem from my notebook (13-Jan-2014):

The Carpets Theorem: Suppose that the floor of a room is completely covered by a collection of non-overlapping carpets. If we move one of the carpets, then the overlapping area is equal to the uncovered area of the floor. [Source: §2.6, Mathematical Olympiad Treasures, Titu Andreescu & Bogdan Enescu]

Why did I mention this theorem? The animation in the Numberphile video reminded me of carpets covering the floor.

And the following is the problem which motivated me to write this blog post (17-May-2018, PBS Infinite Series – Tai-Danae):

Secure Polygon Problem: Consider an n-gon with mirrored walls, with two points: a source point S and a target point T. If it is possible to place a third point B in the polygon such that any ray from the source S passes through this point B before hitting the target T, then the polygon is said to be secure. Is a square a secure polygon?

The answer is YES. Moreover, the solution is amazing, and it reminds me of the cross diagonal cover problem.

New Proofs on YouTube


Earlier, YouTube maths channels focused mainly on giving nice expositions of non-trivial math ideas. But recently, two brand new proofs were presented on YouTube instead of being published in a journal.

  • Proofs of the fact that \sqrt{2}, \sqrt{3}, \sqrt{5} and \sqrt{6} are irrational numbers – Burkard Polster (13 April 2018)

This is an extension of the idea discussed in this paper by Steven J. Miller and David Montague.

  • A new proof of the Wallis formula for π – Sridhar Ramesh and Grant Sanderson (20 Apr 2018):

This is an extension of Donald Knuth‘s idea documented here by Adrian Petrescu.

It’s nice to see how publishing in maths is evolving to be accessible to everyone.


Probability Musing


Please let me know if you know the solution to the following problem:

What is the probability of me waking up at 10am?

What additional information should be supplied so as to determine the probability? What exactly do we mean by the probability of this event? Which kind of conditional probability will make sense?

Consider the following comment by Timothy Gowers regarding the model for calculating the probability of an event involving a pair of dice:


Rolling a pair of dice (p. 6), Mathematics: A very short introduction © Timothy Gowers, 2002 [Source]

I find probability very confusing, for example, this old post.

Conway’s Prime Producing Machine


Primes are not randomly arranged (since their position is predetermined) but we can’t find an equation which directly gives us the nth prime number. However, we can ask for a function (which surely can’t be a polynomial) which will give only the prime numbers as output. For example, the following one is used for the MRDP theorem:


But it’s useless to use this to find bigger primes because the computations are much more difficult than the primality tests.

Conway’s PRIMEGAME takes whole numbers as inputs and outputs 2^k if and only if k is prime.


Source: [Richard Guy, © 1983 Mathematical Association of America]

PRIMEGAME is based on a Turing-complete esoteric programming language called FRACTRAN, invented by John Conway. A FRACTRAN program is an ordered list of positive fractions together with an initial positive integer input n. The program is run by updating the integer n as follows:

  1. for the first fraction f in the list for which nf is an integer, replace n by nf;
  2. repeat this rule until no fraction in the list produces an integer when multiplied by n, then halt.

PRIMEGAME is an algorithm devised to generate primes using a sequence of 14 rational numbers:

\displaystyle{\left( \frac{17}{91}, \frac{78}{85}, \frac{19}{51}, \frac{23}{38}, \frac{29}{33}, \frac{77}{29}, \frac{95}{23}, \frac{77}{19}, \frac{1}{17}, \frac{11}{13}, \frac{13}{11}, \frac{15}{2}, \frac{1}{7}, \frac{55}{1} \right)}

Starting with 2, one finds the first fraction in the machine that gives an integer when multiplied by 2; then for that integer we find the first fraction in the machine that generates another integer, and so on. Except for the initial 2, every power of 2 that appears in the output has a prime exponent (its binary logarithm is a prime number), which is to say that powers of 2 with composite exponents don’t show up.
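The two rules of FRACTRAN fit in a few lines of Python. This is a minimal sketch of my own (not Prof. Granville’s program): fractions are stored as (numerator, denominator) pairs, and the exponents of the powers of 2 are collected as they appear. Since the last fraction of PRIMEGAME is 55/1, the program never halts, so the caller decides when to stop:

```python
PRIMEGAME = [
    (17, 91), (78, 85), (19, 51), (23, 38), (29, 33), (77, 29), (95, 23),
    (77, 19), (1, 17), (11, 13), (13, 11), (15, 2), (1, 7), (55, 1),
]


def fractran(program, n):
    """Yield the successive values of n produced by a FRACTRAN program."""
    while True:
        for numerator, denominator in program:
            if (n * numerator) % denominator == 0:
                # First fraction f for which n*f is an integer: replace n by n*f.
                n = n * numerator // denominator
                break
        else:
            return  # no fraction gives an integer: halt
        yield n


def primes_from_primegame(count):
    """Collect the exponents of the first `count` powers of 2 in the output."""
    exponents = []
    for n in fractran(PRIMEGAME, 2):
        if n & (n - 1) == 0:  # n is a power of 2
            exponents.append(n.bit_length() - 1)
            if len(exponents) == count:
                break
    return exponents


print(primes_from_primegame(5))
```

The machine is very slow as a prime generator: it takes thousands of steps to reach even the first handful of primes.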

If you have some knowledge of computability and unsolvability theory, you can try to understand the working of this machine. There is a nice exposition on OeisWiki to begin with.


“Hilbert’s 10th Problem” by Martin Davis and Reuben Hersh [© 1973 Scientific American, doi: 10.1038/scientificamerican1173-84], illustrating the basic idea of machines from unsolvability theory.

Following is an online program by Prof. Andrew Granville illustrating the working of PRIMEGAME:

Motivation for this post came from Andrew Granville’s Math Mornings at Yale.

Generalization of Pythagoras equation


About 3 years ago I discussed the following two Diophantine equations of degree 2:

In this post, we will see a slight generalization of the result involving Pythagorean triplets. Instead of the Pythagorean equation x^2+y^2-z^2=0, we will work with a slightly more general equation, namely ax^2+by^2+cz^2=0, where a,b,c\in \mathbb{Z}. For proofs, one can refer to section 5.5 of Niven-Zuckerman-Montgomery’s An introduction to the theory of numbers.

Theorem: Let a,b,c\in \mathbb{Z} be non-zero integers such that the product abc is square-free. Then ax^2+by^2+cz^2=0 has a non-trivial solution in integers if and only if a,b,c do not all have the same sign and -bc, -ac, -ab are quadratic residues modulo a,b,c respectively.
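The conditions of the theorem are easy to test mechanically for small coefficients. Here is a brute-force Python sketch; the function names are mine, and “quadratic residue modulo m” is read simply as “congruent to a square modulo |m|”:

```python
def is_square_mod(r, m):
    """True if r is congruent to a square modulo |m| (brute force over residues)."""
    m = abs(m)
    return any((x * x - r) % m == 0 for x in range(m))


def is_square_free(n):
    """True if no square d*d with d > 1 divides n."""
    n = abs(n)
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True


def has_nontrivial_solution(a, b, c):
    """Apply the theorem's criterion to a x^2 + b y^2 + c z^2 = 0."""
    if 0 in (a, b, c) or not is_square_free(a * b * c):
        raise ValueError("need non-zero a, b, c with abc square-free")
    same_sign = (a > 0) == (b > 0) == (c > 0)
    residues_ok = (
        is_square_mod(-b * c, a)
        and is_square_mod(-a * c, b)
        and is_square_mod(-a * b, c)
    )
    return (not same_sign) and residues_ok


print(has_nontrivial_solution(1, 1, -1))  # Pythagorean equation
print(has_nontrivial_solution(1, 1, -3))  # x^2 + y^2 = 3 z^2
```

The first call returns True (for instance 3, 4, 5), while the second returns False because -1 is not a square modulo 3.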

In fact, this result helps us determine the existence of a non-trivial solution of any degree 2 homogeneous equation in three variables, f(X,Y,Z)=\alpha_1 X^2 +\alpha_2Y^2+\alpha_3Z^2+\alpha_4XY+\alpha_5YZ+\alpha_6ZX due to the following lemma:

Lemma: There exists a sequence of changes of variables (linear transformations) so that f(X,Y,Z) can be written as an equation of the form g(x,y,z)=ax^2+by^2+cz^2 with \gcd(a,b,c)=1.

Now let’s consider an example. Let f(x,y,z)=3x^2+5y^2+7z^2+9xy+11yz+13zx; we want to determine whether f(x,y,z)=0 has a non-trivial solution. Firstly, we change variables by completing the square:

\displaystyle{f(x,y,z)=3\left(x+\frac{3}{2}y +\frac{13}{6}z\right)^2 - \frac{7}{4}y^2 - \frac{85}{12}z^2 - \frac{17}{2}yz = g(x',y',z')}

where x' = x+\frac{3}{2}y +\frac{13}{6}z, y'=y and z'=z. Thus

\displaystyle{12g(x',y',z')=36x'^2 - 21 y'^2 - 85z'^2 - 102y'z' = 36x'^2 - 21\left(y'+\frac{17}{7}z'\right)^2+\frac{272}{7}z'^2=h(x'',y'',z'')}

where x'' = x',y'' = y'+\frac{17}{7}z' and z''=z'. Thus

\displaystyle{7h(x'',y'',z'') = 252x''^2 - 147y''^2+272z''^2=7(6x'')^2-3(7y'')^2 + 17(4z'')^2 = F(X,Y,Z)}

where X=6x'', Y=7y'' and Z=4z''. Now we apply the theorem to 7X^2-3Y^2+17Z^2=0. Since all the coefficients are prime numbers, we can use quadratic reciprocity to conclude that the given equation has a non-trivial solution (the only non-trivial thing to note is that -7\times 17 being a quadratic residue mod -3 is the same as -7\times 17 being a quadratic residue mod 3).
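The chain of substitutions can be double-checked with exact rational arithmetic. In the Python sketch below (function names are mine), the factors 12 and 7 introduced along the way mean that F(X,Y,Z) = 84 f(x,y,z); the particular solution (X,Y,Z) = (1,5,2) is one I found by hand, and it is pulled back to a solution of the original equation:

```python
from fractions import Fraction


def f(x, y, z):
    return 3 * x * x + 5 * y * y + 7 * z * z + 9 * x * y + 11 * y * z + 13 * z * x


def F(X, Y, Z):
    return 7 * X * X - 3 * Y * Y + 17 * Z * Z


def to_XYZ(x, y, z):
    """Forward change of variables: (x, y, z) -> (X, Y, Z)."""
    x, y, z = Fraction(x), Fraction(y), Fraction(z)
    x2 = x + Fraction(3, 2) * y + Fraction(13, 6) * z  # x' = x''
    y2 = y + Fraction(17, 7) * z                       # y''
    return 6 * x2, 7 * y2, 4 * z


def from_XYZ(X, Y, Z):
    """Inverse change of variables: (X, Y, Z) -> (x, y, z)."""
    z = Fraction(Z, 4)
    y = Fraction(Y, 7) - Fraction(17, 7) * z
    x = Fraction(X, 6) - Fraction(3, 2) * y - Fraction(13, 6) * z
    return x, y, z


# F equals 84 * f after the substitution (spot check on a few integer points).
samples = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, -3, 5), (-1, 4, 7)]
print(all(F(*to_XYZ(*p)) == 84 * f(*p) for p in samples))

# A solution of 7X^2 - 3Y^2 + 17Z^2 = 0 and its preimage under the substitution.
X, Y, Z = 1, 5, 2
x, y, z = from_XYZ(X, Y, Z)
print(F(X, Y, Z), (x, y, z), f(x, y, z))
```

Clearing denominators in the preimage (-1/6, -1/2, 1/2) gives the integer solution (x,y,z) = (-1,-3,3) of f(x,y,z)=0.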