
Week 12 Notes


In weeks 1-3, we studied permutations, combinations, and the many things one can do with them. In this week's notes, we'll go over some more advanced topics in combinatorics that are certainly useful but show up less frequently.

Recurrence Relations

Recurrence Relations are relationships between elements of a sequence. As a common example, consider the recurrence relation for the Fibonacci numbers.$$f_{n+1}=f_n+f_{n-1}$$

The Fibonacci numbers are defined by the recurrence relation above, and the following values: \(f_0=0\), \(f_1=1\). We call these starting values the initial conditions of the recurrence.

Can you list out the first \(5\) Fibonacci numbers?
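If you want to check your list, here is a minimal Python sketch (an illustration added here, not part of the original notes) that generates terms directly from the recurrence and the initial conditions.

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers, using f_{k+1} = f_k + f_{k-1}."""
    terms = [0, 1]                      # initial conditions f_0 = 0, f_1 = 1
    while len(terms) < n:
        terms.append(terms[-1] + terms[-2])
    return terms[:n]

print(fibonacci(10))                    # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```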

The Fibonacci recurrence is a special type of recurrence relation known as a linear recurrence, which is of the following form: $$a_n=c_1a_{n-1}+c_2a_{n-2}+\ldots+c_ka_{n-k}$$These recurrences are called linear for the same reason linear polynomials are called linear: all of the terms have degree \(1\). Note that for the recurrence above, we need \(k\) initial conditions in order to determine the sequence. Can you see why?

We care a lot about linear recurrences since we have a great way of solving them.

Characteristic Polynomial

Let's take our general linear recurrence from before.

$$a_n=c_1a_{n-1}+c_2a_{n-2}+\ldots+c_ka_{n-k}$$

For those of you who know linear algebra, you'll know that the dimension of the space of solutions to this recurrence is at most \(k\) (as \(k\) initial conditions determines the sequence). So, we simply need to find \(k\) independent solutions to this recurrence, and thus all other solutions can be written in terms of these \(k\) base solutions.

But how do we find the \(k\) base solutions? We guess that the sequence is a geometric sequence. In other words, we guess that \(a_n=x^n\) for some \(x\neq0\). Plugging this guess in, we get $$x^n-c_1x^{n-1}-c_2x^{n-2}-\ldots-c_kx^{n-k}=0$$Since \(x\neq0\), we can divide both sides by \(x^{n-k}\) to get $$x^k-c_1x^{k-1}-\ldots-c_k=0$$

This is a degree \(k\) polynomial! We call this polynomial the characteristic polynomial of the linear recurrence. Thus, there are \(k\) roots (counted with multiplicity), which we call \(x_1,\ldots,x_k\). And, \(a_n=x_i^n\) is a valid solution to our recurrence for each \(1\leq i\leq k\). If the roots are distinct, every solution is of the form $$a_n=b_1x_1^n+b_2x_2^n+\ldots+b_kx_k^n$$for constants \(b_1,\ldots,b_k\).

We call the equation above the general form of the solution to the linear recurrence. To find the specific values of \(b_1,\ldots,b_k\) for our recurrence, we plug in our initial conditions, and we get a system of \(k\) equations in \(k\) unknowns, which we can solve. Let's do an example to show how this is done.

The Fibonacci recurrence is \(f_n=f_{n-1}+f_{n-2}\). With our guess of \(f_n=x^n\), we have $$x^n-x^{n-1}-x^{n-2}=0$$Dividing by \(x^{n-2}\), we have $$x^2-x-1=0$$Using the quadratic formula, we get $$x=\frac{1\pm\sqrt5}2$$So, the general form of the solution to the Fibonacci recurrence is $$f_n=b_1\left(\frac{1+\sqrt5}2\right)^n+b_2\left(\frac{1-\sqrt5}2\right)^n$$To find the values of \(b_1,b_2\), we plug in our initial conditions of \(f_0=0,f_1=1\). $$0=b_1\left(\frac{1+\sqrt5}2\right)^0+b_2\left(\frac{1-\sqrt5}2\right)^0$$$$1=b_1\left(\frac{1+\sqrt5}2\right)^1+b_2\left(\frac{1-\sqrt5}2\right)^1$$Solving this linear system gives us \(b_1=\frac1{\sqrt5}\), \(b_2=-\frac1{\sqrt5}\). So, the formula for the Fibonacci numbers is $$f_n=\frac1{\sqrt5}\cdot\left(\left(\frac{1+\sqrt5}2\right)^n-\left(\frac{1-\sqrt5}2\right)^n\right)$$
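The whole procedure (characteristic roots, then a linear solve against the initial conditions) is easy to mirror numerically. The sketch below is an illustration of my own, assumes the characteristic roots are distinct, and reproduces \(f_{10}=55\) from the closed form.

```python
import numpy as np

def closed_form(coeffs, initial):
    """For a_n = c_1 a_{n-1} + ... + c_k a_{n-k}, return the roots x_i of the
    characteristic polynomial and weights b_i with a_n = sum b_i * x_i**n
    (assumes the roots are distinct)."""
    k = len(coeffs)
    roots = np.roots([1.0] + [-c for c in coeffs])           # x^k - c_1 x^{k-1} - ... - c_k
    A = np.array([[x**n for x in roots] for n in range(k)])  # one row per initial condition
    b = np.linalg.solve(A, np.array(initial, dtype=complex))
    return roots, b

roots, b = closed_form([1, 1], [0, 1])                       # Fibonacci: c_1 = c_2 = 1
f10 = sum(bi * xi**10 for bi, xi in zip(b, roots))
print(round(f10.real))                                       # 55
```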

So that gives us a way to solve for any solution to a linear recurrence, right? Well, not so fast. What if the characteristic polynomial has repeated roots, that is, a root that appears more than once?

In that case, we have the following theorem.

If the characteristic polynomial of a recurrence \(a_n\) has \(r\) as a root with multiplicity \(l\), then all of the following are solutions to the recurrence$$a_n=r^n,\;nr^n,\;n^2r^n,\ldots,\;n^{l-1}r^n$$

This solves our problem, since it guarantees that we will have \(k\) solutions to our recurrence. Here is an example of a multiple-root recurrence.

Let's say we want to find the general form for the recurrence \(a_n=7a_{n-1}-16a_{n-2}+12a_{n-3}\). By our technique, we find the characteristic polynomial to be $$x^3-7x^2+16x-12=(x-3)(x-2)^2$$So, the general form of the solution to the recurrence is $$a_n=b_13^n+b_22^n+b_3n2^n$$
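As a quick sanity check (again, just an illustration), each of the three basis sequences in the general form satisfies the recurrence:

```python
# Each basis sequence 3^n, 2^n, n*2^n satisfies a_n = 7a_{n-1} - 16a_{n-2} + 12a_{n-3}.
basis = [lambda n: 3**n, lambda n: 2**n, lambda n: n * 2**n]
for seq in basis:
    assert all(seq(n) == 7*seq(n-1) - 16*seq(n-2) + 12*seq(n-3) for n in range(3, 30))
print("all three basis sequences satisfy the recurrence")
```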

For some real numbers \(a,b,x,y\), let \(s_n=ax^n+by^n\). Suppose \(s_1=3\), \(s_2=7\), \(s_3=16\), \(s_4=42\). Find \(s_5\).


As you might expect, \(s_n\) can be written as a linear recurrence. To find out how, let's try to write \(s_n\) in terms of other \(s\) terms. Start with \(s_{n-1}=ax^{n-1}+by^{n-1}\), and multiply it by \(x+y\) to get $$(x+y)s_{n-1}=ax^n+by^n+ax^{n-1}y+by^{n-1}x$$$$=s_n+xys_{n-2}$$Rewriting this gives us $$s_n-(x+y)s_{n-1}+xys_{n-2}=0$$Plugging in our known values gives us the equations $$16=7(x+y)-3xy,\;42=16(x+y)-7xy$$Treating \(x+y\) and \(xy\) as our variables, this is a linear system, so we get $$x+y=-14,\;xy=-38$$So, $$s_5=-14s_4+38s_3=20$$
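Here is a short numerical check of that computation (an illustrative sketch): solve the \(2\times2\) system for \(x+y\) and \(xy\), then apply the recurrence once more.

```python
import numpy as np

s = {1: 3, 2: 7, 3: 16, 4: 42}
# s_n = (x+y) s_{n-1} - (xy) s_{n-2}; treat p = x+y and q = xy as the unknowns
A = np.array([[s[2], -s[1]],        # 16 = 7p  - 3q
              [s[3], -s[2]]])       # 42 = 16p - 7q
p, q = np.linalg.solve(A, np.array([s[3], s[4]], dtype=float))
print(p, q)                         # -14.0 -38.0
print(p * s[4] - q * s[3])          # s_5 = 20.0
```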
Suppose you are located at point \(k\) on the real number line, where \(k\) is a positive integer less than \(N\). Every minute, you flip a coin, and if it is heads, you move right by \(1\) unit, and otherwise, you move left by \(1\) unit. The game stops when you either reach \(0\) or \(N\). What is the probability you reach \(N\), instead of \(0\)?


Let \(P_i\) for each \(0\leq i\leq N\) represent the probability of hitting \(N\) instead of \(0\) if you start at position \(i\). Thus, \(P_0=0\), \(P_N=1\). And, our recurrence is $$P_i=\frac12P_{i+1}+\frac12P_{i-1}$$Rearranging gives \(P_{i+1}=2P_i-P_{i-1}\), so the characteristic polynomial for this recurrence is \(\lambda^2-2\lambda+1\), which has \(1\) as a double root. So, the general form for the solution is \(P_i=A+Bi\). Plugging in \(P_0=0\) gives us \(A=0\), and \(P_N=1\) gives us \(B=\frac1N\). Thus, our solution is $$P_k=\frac kN$$
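The answer \(P_k=\frac kN\) is easy to corroborate with a quick Monte Carlo simulation (a sketch, not part of the solution above):

```python
import random

def hit_N_probability(k, N, trials=100_000):
    """Estimate the probability of reaching N before 0, starting from k."""
    wins = 0
    for _ in range(trials):
        pos = k
        while 0 < pos < N:
            pos += 1 if random.random() < 0.5 else -1   # fair coin flip
        wins += (pos == N)
    return wins / trials

print(hit_N_probability(3, 10))   # ≈ 0.3 = k/N
```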

Non-linear recurrences

What do we do if the recurrence is not linear? We expand out a few terms and hope we can see a pattern. Let's do an example.

Suppose \(a_0=0\) and \(a_n=2a_{n-1}+1\). Find a formula for \(a_n\). This may seem linear, but linear recurrences don't have constant terms like \(1\). So what do we do?

Let's try to expand the recurrence a bit to see if we see a pattern.$$a_n=2a_{n-1}+1=2(2a_{n-2}+1)+1=4a_{n-2}+2+1$$$$=8a_{n-3}+4+2+1$$Can you see a pattern yet? Let's use the pattern to fully expand the recurrence$$a_n=2^na_0+2^{n-1}+\ldots+1=\sum\limits_{i=0}^{n-1}2^i=2^n-1$$
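A short loop confirms the pattern for small \(n\) (an illustrative check, not a proof):

```python
a = 0                             # a_0 = 0
for n in range(1, 15):
    a = 2 * a + 1                 # a_n = 2a_{n-1} + 1
    assert a == 2**n - 1
print("a_n = 2^n - 1 holds for n = 1..14")
```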

There are no general rules for solving all recurrences, so if you see a non-linear recurrence on a contest, you can be sure that there's some nice pattern waiting for you if you expand it.

Induction

The principle of mathematical induction, otherwise known as just induction or PMI, is confusing for how simple it is.

Before explaining what PMI is, let's do an example.

For any positive natural number \(n\),$$S_n=\sum\limits_{i=1}^n(2i-1)=1+3+\ldots+(2n-1)=n^2$$


  • Base Case: If \(n=1\), \(S_n=1\), and \(1^2=1\), so \(S_1=1^2\).
  • Inductive Case: Suppose \(S_{n-1}=\sum\limits_{i=1}^{n-1}(2i-1)=(n-1)^2\). Then, $$S_n=\sum\limits_{i=1}^n(2i-1)=(2n-1)+S_{n-1}=(2n-1)+n^2-2n+1=n^2$$

How does the above proof work? It works as follows:

  1. The base case tells us \(S_1=1^2\).
  2. Then, since \(2-1=1\), the inductive case tells us that \(S_2=2^2\).
  3. Since \(3-1=2\), the inductive case tells us that \(S_3=3^2\).

Do you see the pattern here? The idea behind induction is like that of dominoes. If I line them up really nicely so that each domino will knock over the next (the inductive case), then we only need to knock over the first domino (the base case) for all of the dominoes to fall.

Induction allows us to prove a lot of interesting results. Here are a few more examples of useful induction proofs to give you a flavor of what they look like.

For any natural number \(n\),$$(1+2+\ldots+n)^2=1^3+2^3+\ldots+n^3$$


  • Base Case: If \(n=0\), both sides of the equation are \(0\), and thus equal.
  • Inductive Case: Suppose \((1+2+\ldots+n-1)^2=1^3+2^3+\ldots+(n-1)^3\). Then, $$(1+2+\ldots+n)^2=(1+2+\ldots+n-1)^2+2n(1+2+\ldots+n-1)+n^2$$$$=(1^3+\ldots+(n-1)^3)+2n\cdot\frac{n(n-1)}2+n^2$$$$=(1^3+\ldots+(n-1)^3)+n^3-n^2+n^2=1^3+\ldots+n^3$$
For all natural numbers \(n\), \(x^{2n+1}+y^{2n+1}\) has \(x+y\) as a factor.


  • Base Case: When \(n=0\), \(x^{2n+1}+y^{2n+1}=x+y\), so \(x+y\) obviously has \(x+y\) as a factor.
  • Inductive Case: Suppose \(x^{2n-1}+y^{2n-1}\) has \(x+y\) as a factor. Then, by modular arithmetic, $$x^{2n-1}+y^{2n-1}\equiv0\bmod{x+y}$$By mod properties, $$x^{2n-1}\equiv-y^{2n-1}\bmod{x+y}$$Since \(x^2-y^2=(x+y)(x-y)\), \(x^2-y^2\equiv0\bmod{x+y}\), so by mod properties, $$x^{2n-1}\cdot x^2\equiv-y^{2n-1}\cdot y^2\bmod{x+y}$$$$x^{2n+1}+y^{2n+1}\equiv0\bmod{x+y}$$
Prove the exercise from the Challenge Section in Week 3 that requires induction.

There are many variants of induction. The most useful variant is called Strong Induction. In this variant, we prove the following step for our property \(P\): for every natural number \(n\), if \(P(m)\) holds for all natural numbers \(m\lt n\), then \(P(n)\) holds.

Interestingly, for strong induction, you don't need a base case (why?). It is called strong, since instead of assuming that the property is true just for the natural number just below \(n\), we assume it is true for all natural numbers below \(n\). This can be useful in some induction proofs, like the following.

All natural numbers \(\gt1\) can be written as a product of one or more primes.


  • Inductive Case: Given \(n\gt1\), suppose that every natural number \(m\) with \(1\lt m\lt n\) can be written as a product of primes. There are two possibilities.
    1. There exists a natural number \(a\) such that \(1\lt a\lt n\) and \(a\) is a factor of \(n\). In this case, \(b=\frac na\) is also a natural number. Since \(a\gt1\), \(b\lt n\), and since \(a\lt n\), \(b\gt1\). So, both \(a\) and \(b\) can be written as a product of primes. So \(n\) can be written as the product of \(a\)'s primes times the product of \(b\)'s primes.
    2. If Case 1 is not true, then by the definition of prime numbers, \(n\) is prime, and thus is a product of prime numbers.
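The strong-induction argument above is really a recursive algorithm in disguise: look for a proper factor, recurse on both pieces, and stop when none exists. Here is a minimal Python sketch of that idea (my own illustration):

```python
def prime_factorization(n):
    """Return a list of primes whose product is n, for n > 1.
    Mirrors the strong-induction proof: find a proper factor a, recurse on
    a and n // a; if no proper factor exists, n itself is prime."""
    for a in range(2, n):
        if n % a == 0:
            return prime_factorization(a) + prime_factorization(n // a)
    return [n]                              # no proper factor, so n is prime

print(prime_factorization(360))             # [2, 2, 2, 3, 3, 5]
```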

This proof would have been nearly impossible to do with just vanilla induction, but strong induction is good enough to prove the result. Here are some exercises to test your understanding.

Prove the following equation is true for all natural numbers \(n\) without using Euler's Formula$$(\cos\theta+i\sin\theta)^n=\cos n\theta+i\sin n\theta$$Hint: remember the formulas for \(\cos(a+b)\) and \(\sin(a+b)\).

Prove that the following statements hold for all natural numbers \(n\)$$f_{n+1}\leq\left(\frac74\right)^n$$$$f_{2n+2}\text{ is a multiple of }f_{n+1}$$$$\gcd(f_{n+2},f_{n+1})=1$$

Prove the Binomial Theorem from Week 3 for all natural numbers \(n\)$$(x+y)^n=\sum\limits_{k=0}^n{n\choose k}x^ky^{n-k}$$

There is another variation of induction known as infinite descent. In essence, infinite descent is the principle that there is no infinite sequence of decreasing natural numbers. The first known use of this property is in the proof of the irrationality of \(\sqrt2\).

\(\;\sqrt2\) is irrational.


Suppose \(\sqrt2=\frac {a_1}{b_1}\) for some positive integers \(a_1,b_1\). Then, \(a_1^2=2b_1^2\), which implies \(a_1\) is even. Thus, \(a_1=2a_2\) for some positive integer \(a_2\). This means \(4a_2^2=2b_1^2\), i.e. \(b_1^2=2a_2^2\), which implies \(b_1\) is even, and thus \(b_1=2b_2\) for some positive integer \(b_2\). But then \(a_2^2=2b_2^2\), so \(a_2,b_2\) satisfy the same equation with \(a_2\lt a_1\). Repeating this process ad nauseam gives us an infinite decreasing sequence of positive integers \(a_1\gt a_2\gt\ldots\), which is a contradiction. Thus, there is no rational number that equals \(\sqrt2\).

Now for some more interesting applications.

Prove that the equation \(a^2+b^2=3c^2\) has no positive integer solutions.


Suppose \(a_1,b_1\) and \(c_1\) form a positive integer solution. Reducing both sides modulo \(3\) gives us \(a_1^2+b_1^2\equiv0\bmod3\). Note that if \(a_1\not\equiv0\bmod3\), then \(a_1^2\equiv1\bmod3\), which means \(a_1^2+b_1^2\equiv1,2\bmod3\). Thus, \(a_1\equiv0\bmod3\), and similarly, \(b_1\equiv0\bmod3\). So, there exist positive integers \(a_2,b_2\) such that \(a_1=3a_2\), \(b_1=3b_2\). Rewriting our equation gives us $$9a_2^2+9b_2^2=3c_1^2$$This implies \(c_1^2\), and hence \(c_1\), is a multiple of \(3\), so there exists a positive integer \(c_2\) such that \(c_1=3c_2\). Then \(a_2^2+b_2^2=3c_2^2\), a smaller solution. Repeating this process ad nauseam gives us an infinite sequence of decreasing positive integers, which is a contradiction. Thus, there are no solutions in the positive integers.
Let \(a_1,\ldots,a_{2n+1}\) be integers such that if any one is removed, the remaining ones can be separated into two sets of equal size with equal sums. Prove that $$a_1=a_2=\ldots=a_{2n+1}$$


We can assume that $$a_1\leq a_2\leq\ldots\leq a_{2n+1}$$If we subtract \(a_1\) from each number, the property still holds, so we have $$0\leq a_2-a_1\leq\ldots\leq a_{2n+1}-a_1$$So, we can assume that \(a_1=0\) and all of the integers are non-negative, and we want to show that they are all zero. Suppose, for contradiction, that \(a_{2n+1}\neq0\).

Let \(S_1=a_1+\ldots+a_{2n+1}\). Since \(a_1=0\) and the remaining elements can be split into two sets of equal sums, \(S_1\) is even. And, since removing any element leaves an even sum, \(S_1-a_i\) is even for every \(i\), so every \(a_i\) is even. So, let \(b_i=\frac{a_i}2\), which are non-negative integers. Clearly, the given property still holds for \(b_1,b_2,\ldots,b_{2n+1}\). Repeating this process ad nauseam gives us an infinite strictly decreasing sequence of positive integers \(a_{2n+1}\gt b_{2n+1}\gt\ldots\), which is a contradiction. Thus, \(a_{2n+1}=0\), which implies all of the \(a_i\) are zero.

Pigeonhole Principle

The pigeonhole principle is another example of an idea that's very simple, but can be used in very interesting ways.

Imagine that there are 10 pigeons, and 9 pigeonholes (areas for pigeons to roam). If each pigeon needs to go into a hole, then clearly at least one hole has more than one pigeon.

This argument generalizes as follows. If I have \(n\) pigeons and \(k\) holes, then there is a hole with at least \(\left\lceil\frac nk\right\rceil\) pigeons (\(\lceil\,\rceil\) is the ceiling function, which returns the smallest integer greater than or equal to its input). Let's take a look at some examples of this principle in use.
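Here is a tiny randomized illustration of the general statement (my own sketch): no matter how the pigeons are assigned, some hole ends up with at least \(\left\lceil\frac nk\right\rceil\) of them.

```python
import math
import random
from collections import Counter

n, k = 10, 9                                               # pigeons, holes
for _ in range(1000):
    holes = [random.randrange(k) for _ in range(n)]        # hole chosen by each pigeon
    fullest = max(Counter(holes).values())
    assert fullest >= math.ceil(n / k)                     # here, at least 2
print("some hole always gets at least", math.ceil(n / k), "pigeons")
```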

We first used the principle in Week 8, in the proof of Bezout's Lemma. For those of you who didn't know this principle before this week, I'd suggest revisiting those notes to understand that proof. Here's a more fun application of the principle:

The census estimates that there are more than 30 million people in California who are not bald. And, every human has at most 1 million hairs on their head. By the pigeonhole principle, there are at least \(30\) Californians who are not bald and have the same number of hairs on their head.

Weird, huh? Let's look at some non-trivial applications.

Given any \(10\) integers in the range \([1,100]\), there exist two different non-empty subsets of the \(10\) integers with the same sum.


Given any subset of the \(10\) integers, the maximum sum of that subset is \(10\cdot100=1000\), and the minimum is \(1\cdot1=1\). So, there are \(1000\) possibilities for the sum. And, there are \(2^{10}-1=1023\) possible non-empty subsets of the \(10\) integers. Since \(1023\gt1000\), there exist two different non-empty subsets with the same sum!
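The argument only proves existence, but a brute-force search finds a colliding pair of subsets almost instantly, which is a nice way to see the counting in action (an illustrative sketch):

```python
import random
from itertools import combinations

def equal_sum_subsets(nums):
    """Find two different non-empty subsets of nums with the same sum."""
    seen = {}                                    # sum -> first subset achieving it
    for r in range(1, len(nums) + 1):
        for subset in combinations(nums, r):
            s = sum(subset)
            if s in seen:
                return seen[s], subset           # guaranteed to occur by pigeonhole
            seen[s] = subset

nums = random.sample(range(1, 101), 10)          # 10 distinct integers in [1, 100]
print(nums)
print(equal_sum_subsets(nums))
```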
Every polyhedron has at least two faces with the same number of edges.


Let \(N\) be the maximum number of edges of any face of our polyhedron. Then, the face with \(N\) edges is adjacent to \(N\) other faces, so the polyhedron has at least \(N+1\) faces, and the number of edges of each face can range from \(3\) to \(N\), which is only \(N-2\) possible values. Since \(N+1\gt N-2\), by the pigeonhole principle, at least two faces have the same number of edges.
For any positive natural number \(n\), there exists a positive Fibonacci number \(f_k\) such that \(n\) is a factor of \(f_k\).


For each natural number \(k\), consider the pair \((f_k\bmod n,f_{k+1}\bmod n)\). There are infinitely many such pairs (since there are infinitely many natural numbers), but only \(n^2\) possible values for the pairs (since there are \(n\) possible remainders \(\bmod n\)). So, there exist naturals \(i\lt j\) such that \(f_i\equiv f_j\bmod n\) and \(f_{i+1}\equiv f_{j+1}\bmod n\). Thus, $$f_{i-1}\equiv f_{i+1}-f_i\equiv f_{j+1}-f_j\equiv f_{j-1}\bmod n$$By induction (Exercise for the reader: set up the induction properly), for all \(k\leq i\), \(f_{i-k}\equiv f_{j-k}\bmod n\). Letting \(k=i\) and using \(f_0=0\), we have \(f_{j-i}\equiv0\bmod n\). Thus, \(f_{j-i}\) is positive and a multiple of \(n\).
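Again the proof is non-constructive, but the promised Fibonacci number is easy to hunt for directly; working modulo \(n\) keeps the numbers small (an illustrative sketch):

```python
def fib_multiple_index(n):
    """Return the smallest k >= 1 such that n divides f_k (with f_0 = 0, f_1 = 1)."""
    a, b = 0, 1            # f_0 mod n, f_1 mod n
    k = 1
    while b % n != 0:
        a, b = b, (a + b) % n
        k += 1
    return k

for n in (2, 3, 7, 10, 1000):
    print(n, fib_multiple_index(n))
# e.g. 7 divides f_8 = 21, and 10 divides f_15 = 610
```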
If \(S\) is a set of \(10\) distinct positive reals, show that there exist two different reals \(x,y\) in \(S\) such that $$0\lt x-y\lt\frac{(1+x)(1+y)}9$$


Let \(T\) be the set of elements \(\frac1{1+x}\) for each \(x\) in \(S\). Clearly, \(T\) has \(10\) distinct elements in \((0,1)\). By the pigeonhole principle, among the \(9\) intervals \(\left(\frac n9,\frac{n+1}9\right]\) for \(n=0,1,\ldots,8\), there exists an interval with at least two elements of \(T\). Call these elements \(\frac1{1+x}\lt\frac1{1+y}\), with \(x,y\) in \(S\). $$\frac n9\lt\frac1{x+1}\lt\frac1{y+1}\leq\frac{n+1}9$$So$$0\lt\frac1{1+y}-\frac1{1+x}\lt\frac19$$Thus by Simon's Favorite Factoring Trick, $$\begin{aligned}0\lt x-y&=\frac{x-y}{(1+x)(1+y)}\cdot(1+x)(1+y)\\&=\left(\frac1{1+y}-\frac1{x+1}\right)(1+x)(1+y)\\&\lt\frac19(1+x)(1+y)\end{aligned}$$

Challenge: Functional Equations

In algebra, we are usually presented with a function first, and then we study its properties. However, in a functional equation, we are given the property first, and then we must deduce all functions that have the given property. For example, the problem might be to find all functions \(f :\mathbb R \to\mathbb R\) such that$$f(f(x)-f(f(y)))=x-y$$for all \(x,y\in\mathbb R\), where \(\mathbb R\) is the symbol for the real numbers.

Before we start studying such equations, let's clarify some basic notation.

The notation \(f:A\to B\) denotes a function \(f\) with domain \(A\) and codomain \(B\). The domain of a function is the set of inputs to the function, and the codomain is the set of possible outputs.

In school, you have probably learned about the range, rather than the codomain. The range of a function is the set of actual outputs of the function. To see the difference, let's do an example. The function $$f:\mathbb R\to\mathbb R,\;f(x)=x^2$$has domain \(\mathbb R\) and codomain \(\mathbb R\), but the range of the function is the set of non-negative reals (sometimes written as \(\mathbb R^{\geq0}\)), since \(x^2\) takes exactly the non-negative real values.

Functional Equations in general are very difficult to solve. However, if you see them on a contest, you can bet that there is some nice trick that will make them solvable. We'll review some of the more common tricks.

Substitution

With many functional equations, the best place to start is to substitute in values for the variables. \(0\) and \(1\) are often good candidates, but any substitution which leads to a simplified equation is helpful.

Find all functions \(f:\mathbb R\to\mathbb R\) such that for all \(x,y\in\mathbb R\), $$f(f(x)-f(f(y)))=x-y$$


We might try \(x=0\) or \(y=0\), but these substitutions don't really help much. So how do we simplify the left hand side with a substitution? Note that if we let \(x=f(y)\), then we have \(f(x)-f(f(y))=0\), which gives us $$f(0)=f(y)-y$$This is very useful, because it tells us that \(f(y)=y+c\), where \(c\) is a constant. Let's check that this solution actually works. $$f(f(x)-f(f(y)))=(x+c-(y+2c))+c=x-y$$Thus, the solutions to this functional equation are of the form \(f(x)=x+c\).

Make sure you check any solutions to functional equations you find! You may discover that some of your candidates aren't actually solutions.
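One cheap way to check a candidate before writing up a proof is to test it numerically on random inputs (a sketch of my own; a passing test is evidence, not a proof):

```python
import random

def satisfies_equation(f, trials=10_000, tol=1e-9):
    """Numerically test f(f(x) - f(f(y))) = x - y on random real inputs."""
    for _ in range(trials):
        x, y = random.uniform(-100, 100), random.uniform(-100, 100)
        if abs(f(f(x) - f(f(y))) - (x - y)) > tol:
            return False
    return True

print(satisfies_equation(lambda x: x + 5))   # True:  f(x) = x + c works
print(satisfies_equation(lambda x: 2 * x))   # False: f(x) = 2x does not
```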

Separating Variables

If substitution isn't getting you anywhere, separating variables can be very useful. We saw this technique in the previous example, where we isolated \(y\) on one side of the equation, which allowed us to find the form of the solution. We can also apply this technique with multiple variables.

Find all functions \(f:\mathbb R\to\mathbb R\) such that for all \(x,y\in\mathbb R\)$$(x-y)f(x+y)-(x+y)f(x-y)=4xy(x^2-y^2)$$


Clearly, we can substitute \(a=x+y\), \(b=x-y\) into our functional equation to simplify it. Note that \(a,b\) can be any real values.$$bf(a)-af(b)=(a^2-b^2)ab$$We want to isolate \(a,b\) to different sides of the equation. For nonzero \(a,b\), dividing both sides by \(ab\) and moving terms around gives us $$\frac{f(a)}a-a^2=\frac{f(b)}b-b^2$$With this isolation, we know that \(\frac{f(x)}x-x^2\) must be constant for all nonzero \(x\in\mathbb R\). So, $$f(x)=x^3+kx$$for constant \(k\) (setting \(x=y\) in the original equation gives \(-2xf(0)=0\), so \(f(0)=0\) and this formula holds at \(0\) as well). Now, let's check whether this is actually a solution to the functional equation. $$\begin{aligned}&(x-y)f(x+y)-(x+y)f(x-y)\\&=(x-y)\left[(x+y)^3+k(x+y)\right]-(x+y)\left[(x-y)^3+k(x-y)\right]\\&=(x^2-y^2)\left[(x+y)^2-(x-y)^2\right]\\&=4xy(x^2-y^2)\end{aligned}$$
Find all functions \(f:\mathbb R\to\mathbb R\) that satisfy the following equation for all \(x,y\in\mathbb R\)$$xf(x)-yf(y)=(x-y)f(x+y)$$


Let's start by guessing some solutions. The identity function \(f(x)=x\) works, as does any function \(f(x)=ax\) for constant \(a\). Is that all of the solutions? Let's try the general form of a linear function. If \(f(x)=ax+b\), we have $$\begin{aligned}x(ax+b)-y(ay+b)&=(x-y)(ax+ay+b)\\ax^2+bx-ay^2-by&=ax^2+axy+bx-axy-ay^2-by\end{aligned}$$Thus, every linear function \(f(x)=ax+b\) is a solution. How do we know that there aren't other solutions?

There doesn't seem to be an easy way to see if there are other solutions. So, let's try to simplify our work by substituting in a new function. Let \(g(x)=f(x)-f(0)\). Note that since \(f(x)=g(x)+f(0)\), \(g(x)\) has the same domain and codomain as \(f(x)\). And, if \(f(x)\) is a solution to the functional equation, $$x(f(x)-f(0))-y(f(y)-f(0))=(x-y)(f(x+y)-f(0))$$is an equality, so \(g(x)\) is also a solution to the functional equation. Why is this a useful substitution? Because \(g(0)=0\), let \(y=-x\) to get $$x(g(x)+g(-x))=0$$Thus, \(-g(x)=g(-x)\) for all \(x\) (since \(g(0)=0\)). We can use this fact by letting \(x=u\), \(y=-v\) in the functional equation for \(g\) (and using \(g(-v)=-g(v)\)) to get $$u(g(u))-v(g(v))=(u+v)g(u-v)$$Letting \(x=u\), \(y=v\) gives us $$u(g(u))-v(g(v))=(u-v)g(u+v)$$Since \(u\) and \(v\) can be any real numbers, so can \(u+v\) and \(u-v\). So, letting \(a=u+v\) and \(b=u-v\), we have the equation $$ag(b)=bg(a)$$In particular, with \(b=1\), \(g(a)=ag(1)\). This proves that \(f(x)=g(1)x+f(0)=(f(1)-f(0))x+f(0)\), a linear function, which shows that we have found all solutions.

Iterated Functions

When a function is applied repeatedly to an input, we call the resulting function iterated. For example, we can write$$g^3(x)=g(g(g(x)))=g\circ g\circ g(x)$$We often prefer the latter notation when there is a lot of iteration, as the parentheses become distracting and the exponent might be confusing.

With iterated functions, there are two things to look out for. The first is cyclic functions. A function \(g\) is called cyclic of order \(n\) if the following equation holds for all \(x\) in the domain of \(g\)$$\underbrace{g\circ g\circ\ldots\circ g}_{n\text{ times}}(x)=x$$Hidden cyclic functions are often the key to solving iterated functional equations.

Suppose that \(f:\mathbb R\to\mathbb R\) satisfies $$2f(x)+3f\left(\frac{2x+29}{x-2}\right)=100x+80$$What is \(f(3)\)?


Since we want to find \(f(3)\), we first try substituting \(x=3\) to get$$2f(3)+3f(35)=380$$Now, we want to find \(f(35)\), so we let \(x=35\) to get$$2f(35)+3f(3)=3580$$Surprised? This is because the function \(g(x)=\frac{2x+29}{x-2}\) is cyclic of order \(2\) (Prove it!). With these two equations, we have a linear system of equations we can solve, and we find \(f(3)=1996\).
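Both claims are quick to confirm numerically (my own sketch, separate from the solution above): \(g\) composed with itself is the identity away from the pole at \(x=2\), and the two linear equations pin down \(f(3)\).

```python
import random
import numpy as np

g = lambda x: (2 * x + 29) / (x - 2)

# g is cyclic of order 2: g(g(x)) = x wherever both evaluations are defined
for _ in range(1000):
    x = random.uniform(-50, 50)
    if abs(x - 2) > 0.5:                      # stay away from the pole at x = 2
        assert abs(g(g(x)) - x) < 1e-6

# 2 f(3) + 3 f(35) = 380  and  3 f(3) + 2 f(35) = 3580
A = np.array([[2.0, 3.0], [3.0, 2.0]])
f3, f35 = np.linalg.solve(A, np.array([380.0, 3580.0]))
print(f3, f35)                                # ≈ 1996, -1204
```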

The second concept to look out for is commutativity. If \(f^n(x)=g(x)\) for all \(x\) for some \(n\gt0\), then $$f^{n+1}(x)=f(g(x))=g(f(x))$$for all \(x\).

Find all functions \(f:\mathbb Z\to\mathbb Z\) (\(\mathbb Z\) is the set of integers) such that for all \(n\in\mathbb Z\)$$f(f(n))=n+1$$


If \(g(n)=n+1\), \(f(f(n))=g(n)\) for all \(n\). Thus, $$f(n+1)=f(g(n))=g(f(n))=f(n)+1$$Thus, the only possible solutions are of the form \(f(n)=n+c\) for constant \(c\). To see whether these solutions are actually solutions, we plug them back into the functional equation.$$f(f(n))=n+2c=n+1$$This implies \(c=\frac12\), but that would mean \(f(0)\) is not an integer! So, there are no solutions to this functional equation.

Fixed Points

Another recurring concept in functional equations is that of fixed points. Given a function \(f:A\to B\), \(x\in A\cap B\) is said to be a fixed point of \(f\) if \(f(x)=x\). Recognizing functions with fixed points can quickly simplify functional equations.

Find all functions \(f:\mathbb R^+\to\mathbb R^+\) (\(\mathbb R^+\) is the set of positive real numbers) satisfying the following for all \(x,y\in\mathbb R^+\)$$\begin{aligned}f(xf(y))=yf(x)\\\lim\limits_{x\to\infty}f(x)=0\end{aligned}$$


Letting \(x=y\), we have \(f(xf(x))=xf(x)\). Thus, \(xf(x)\) is a fixed point of \(f\) for all \(x\). In particular, this means that there is at least one fixed point of \(f\). Let \(S\) be the set of all fixed points of \(f\).

For any \(y\in S\), let \(x=1\) in the given functional equation, which implies $$f(f(y))=yf(1)$$Since \(y\) is a fixed point, \(f(f(y))=f(y)=y\), so \(yf(1)=y\). Since \(y\gt0\), we have \(f(1)=\frac yy=1\), and thus \(1\) is a fixed point of \(f\). Are there any others?

There aren't, due to the second condition. Let \(y\in S\), \(x=\frac1y\) in the functional equation, which implies $$\begin{aligned}f\left(\frac1yf(y)\right)&=yf\left(\frac 1y\right)\\1&=yf\left(\frac 1y\right)\end{aligned}$$This implies \(\frac1y\in S\). Moreover, for any \(a,b\in S\), $$\begin{aligned}f(af(b))&=bf(a)\\f(ab)&=ab\end{aligned}$$Thus, \(ab\in S\). So, if \(y\in S\) with \(y\gt1\), then \(y^n\in S\) for every natural \(n\); since \(f(y^n)=y^n\) grows to infinity, this contradicts the limit approaching \(0\). On the other hand, if \(y\in S\) with \(y\lt1\), then \(\frac1y\in S\) and \(\frac1y\gt1\), which is a contradiction by the argument above. Thus, \(S=\{1\}\). More importantly, this tells us that \(xf(x)=1\) for all \(x\), since \(xf(x)\) is a fixed point for all \(x\). Thus, \(f(x)=\frac1x\) is the only possible solution. Plugging it into the functional equation shows that it is in fact a solution.

Injective, Surjective, Monotonic

Often, finding solutions to functional equations comes down to recognizing that they are either injective, surjective, monotonic, or some combination of these.

A function \(f\) is injective if for any \(x\neq y\) in the domain, \(f(x)\neq f(y)\).

We discussed injective mappings in the Challenge Section in Week 10, but the idea can be applied to functional equations as well. To summarize, an injective function is one in which each input has a unique output (no two inputs map to the same output).

Given the functional equation \(f(f(x))=x+f(x)\), observe that for any \(x,y\) such that \(f(x)=f(y)\), \(f(f(x))=f(f(y))\), so \(x+f(x)=y+f(y)\). Thus, \(x=y\), proving that any \(f\) satisfying the functional equation must be injective.
A function \(f\) is surjective if for any \(y\) in the codomain, there exists \(x\) in the domain such that \(f(x)=y\).

Surjective functions are ones in which the range and codomain are the same. In other words, for every possible output, there exists an input that maps to that output. As an example, the function \(f:\mathbb R\to\mathbb R\) which is defined as \(f(x)=x^2\) is not surjective since there is no \(x\in\mathbb R\) that maps to \(-1\).

A function \(f\) is called increasing if for all \(x\leq y\) in the domain, \(f(x)\leq f(y)\). A function \(f\) is called strictly increasing if for all \(x\lt y\) in the domain, \(f(x)\lt f(y)\). A function \(f\) is called monotonic if either \(f\) or \(-f\) is increasing. A function \(f\) is called strictly monotonic if either \(f\) or \(-f\) is strictly increasing.

Monotonic functions never change direction: they either never decrease or never increase. With these three types of functions, we can learn a lot about the space of possible solutions to a functional equation.

Find all \(f:\mathbb R\to\mathbb R\) such that for all \(x,y\in\mathbb R\)$$f(xf(y))+f(f(x)+f(y))=yf(x)+f(x+f(y))$$


None of the previous tactics we've discussed are much help here. However, observe that all of the inputs to \(f\) are just either \(x\), \(y\), or some combination of terms including \(f\). This is often a good hint that \(f\) might be injective.

Suppose \(a,b\in\mathbb R\) such that \(f(a)=f(b)\). We'd like to show that \(a=b\) to prove \(f\) is injective. Let \(y=a\) and \(y=b\) in the given functional equation to get $$\begin{aligned}af(x)&=f(xf(a))+f(f(x)+f(a))-f(x+f(a))\\bf(x)&=f(xf(b))+f(f(x)+f(b))-f(x+f(b))\end{aligned}$$Since \(f(a)=f(b)\), the right hand sides of both equations are equal, so \(af(x)=bf(x)\). Thus, either \(f(x)=0\) for all \(x\), or \(a=b\), implying injectivity.

It's not hard to see that \(f(x)=0\) is a solution to the functional equation. If \(f\) is not identically zero, it is injective. To use this, ideally we want an equation of the form \(f(\text{something})=f(\text{other thing})\), so that we can conclude that the two inputs are equal. By playing around with values, we find such an equation by setting \(x=0\) and \(y=1\): $$f(f(0)+f(1))=f(f(1))$$With this, we know that \(f(0)+f(1)=f(1)\), and thus \(f(0)=0\). Now, plug \(y=0\) into the functional equation to get \(f(f(x))=f(x)\). Since \(f\) is injective, this implies \(f(x)=x\). We can check that this also is a solution. Thus, we have two solutions: \(f(x)=0\) and \(f(x)=x\).

Many times, as in the problem above, we can't prove injectivity of the function outright, but the process of trying helps move the solution forward.

Find all \(f:\mathbb R\to\mathbb R\) such that for all \(x,y\in\mathbb R\)$$f(f(x)+y)=2x+f(f(y)-x)$$


We can try the tactics discussed in previous sections, but none really make much progress. Trying to prove injectivity of \(f\) is an interesting idea, but the \(+y\) and \(-x\) terms make it challenging. Instead, let's try to prove that \(f\) is surjective.

For any \(b\in\mathbb R\), we want to find some \(a\in\mathbb R\) such that \(f(a)=b\). Let \(y=-f(x)\) in the functional equation to get $$f(0)-2x=f(f(-f(x))-x)$$As gross as this looks, it proves surjectivity, since for any \(b\in\mathbb R\), letting \(x=\frac{f(0)-b}2\) and \(a=f(-f(x))-x\) shows \(f(a)=b\).

Now let's use surjectivity. As a first idea, perhaps the fact that there exists some \(c\in\mathbb R\) such that \(f(c)=0\) is useful. Letting \(x=c\) gives us $$f(y)=2c+f(f(y)-c)$$Since \(f\) is surjective, \(f(y)\) can be any real value, so we can substitute \(x\) for \(f(y)\) to get $$x=2c+f(x-c)$$So, \(f(x)=x-c\) is the only class of possible solutions. We can plug this back into the functional equation to see that it works for all constants \(c\).

Surjectivity is not all that common to see in functional equation problems, but a tell-tale sign to use it is when you can write the functional equation in the form \(f(\text{mess})=\text{something}\), and the "something" can be any value of the codomain.

Let \(\mathbb N^+\) be the set of positive integers. Find all strictly increasing functions \(f:\mathbb N^+\to\mathbb N^+\) such that \(f(2)=2\) and for all positive integers \(m,n\), \(f(mn)=f(m)f(n)\).


If \(m=2\), we have \(f(2n)=2f(n)\). Thus, \(2=f(2)\lt f(3)\lt f(4)=4\). This implies \(f(3)=3\). In fact, we can generalize this via induction. Our inductive hypothesis is that for any positive integer \(m\) and any positive integer \(n\leq 2^m\), \(f(n)=n\).

Base Case: \(m=1\): We are given \(f(2)=2\), and since \(f\) is strictly increasing with values in \(\mathbb N^+\), \(1\leq f(1)\lt f(2)=2\), so \(f(1)=1\).
Inductive Case: Assume the claim is true for \(m\). Then, \(f(2^m)=2^m\), and thus \(f(2^{m+1})=f(2\cdot2^m)=2^{m+1}\). And, we have $$2^m=f(2^m)\lt f(2^m+1)\lt f(2^m+2)\lt\ldots\lt f(2^{m+1}-1)\lt f(2^{m+1})=2^{m+1}$$We have to fit \(2^{m+1}-2^m-1\) values strictly in between \(2^m\) and \(2^{m+1}\), which is only possible if \(f(n)=n\) for all \(2^m\lt n\lt 2^{m+1}\). Thus, \(f(n)=n\) is the only possible solution, which can be checked to be a solution.

The approach used in this problem is known as squeezing, and is very common in problems with monotonic conditions. Squeezing only works when the codomain is discrete (meaning that between consecutive elements of the codomain, there are no other elements).

Traps!

Now that we've covered plenty of approaches to solving functional equations, let's go over some common traps and mistakes to avoid when solving them.

The most common one is unexpected solutions. Often, when solving functional equations, we have to make a guess as to the form of the functions that satisfy them. While this may lead to some solutions, trying to prove that these solutions are the only ones can be a lost cause. If you are struggling, consider the possibility that there are solutions you haven't accounted for.

Find all functions \(f:\mathbb R\to\mathbb R\) such that \(f(x+y)=f(x)+f(y)\) for all \(x,y\in\mathbb R\).


Clearly, \(f(x)=cx\) is a solution for all \(c\in\mathbb R\). You might try to prove that all solutions are of this form, but you'd be stuck. Why? Well, because there are other solutions! Using the Axiom of Choice, you can show that there are infinitely many other solutions, none of which are easily describable.

As silly as the above example is, unexpected solutions show up all of the time. Just because you can't think of a solution doesn't mean it isn't there. Here's another example.

Find all functions \(f:\mathbb Z\to\mathbb Z\) such that for all integers \(a,b,c\) that satisfy \(a+b+c=0\)$$f(a)^2+f(b)^2+f(c)^2=2f(a)f(b)+2f(a)f(c)+2f(b)f(c)$$


Setting \(a,b,c=0\) gives us \(f(0)=0\). Setting \(a=0\), \(b=-c\) gives us $$(f(b)-f(-b))^2=0$$Which implies \(f(b)=f(-b)\). If we set \(a=b\) and \(c=-2a\), we get $$f(2a)^2=4f(2a)f(a)$$Assuming \(f(2a)\neq0\), we get \(f(2a)=4f(a)\). From this, you may suspect the solutions are of the form \(f(x)=cx^2\), which all work. Are these the solutions?

The answer is no. We've ignored the possibility that \(f(2a)=0\) for some values of \(a\) but not for others. The complete set of solutions consists of the following functions, for any integer \(a\) (a numerical check follows the list):
  • \(f(n)=an^2\)
  • \(f(n)=0\) for even \(n\) and \(f(n)=a\) for odd \(n\)
  • \(f(n)=0\) for \(n\equiv0\bmod4\), \(f(n)=a\) for odd \(n\), \(f(n)=4a\) for \(n\equiv2\bmod4\)
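Each of the three families above can be spot-checked numerically (an illustrative sketch; the constant is called \(k\) here to avoid clashing with the variable \(a\) in the equation):

```python
import random

def satisfies(f, trials=20_000):
    """Test f(a)^2 + f(b)^2 + f(c)^2 = 2f(a)f(b) + 2f(a)f(c) + 2f(b)f(c)
    on random integer triples with a + b + c = 0."""
    for _ in range(trials):
        a, b = random.randint(-50, 50), random.randint(-50, 50)
        c = -a - b
        lhs = f(a)**2 + f(b)**2 + f(c)**2
        rhs = 2*f(a)*f(b) + 2*f(a)*f(c) + 2*f(b)*f(c)
        if lhs != rhs:
            return False
    return True

k = 3   # any fixed integer constant
families = [
    lambda n: k * n * n,                                     # f(n) = k n^2
    lambda n: 0 if n % 2 == 0 else k,                        # 0 on evens, k on odds
    lambda n: 0 if n % 4 == 0 else (k if n % 2 else 4 * k),  # split by residue mod 4
]
print([satisfies(f) for f in families])                      # [True, True, True]
```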

The final trap is difficult to explain because it is quite nuanced, and thus won't appear in most competitions. However, contest writers have included this trap many times over the years, so it would feel wrong not to include it.

Find all functions \(f:\mathbb R^+\to\mathbb R^+\) such that for all \(w,x,y,z\in\mathbb R^+\) satisfying \(wx=yz\)$$\frac{f(w)^2+f(x)^2}{f(y^2)+f(z^2)}=\frac{w^2+x^2}{y^2+z^2}$$


We start with some basic substitutions. Setting all of the variables equal gives us $$f(w)^2=f(w^2)$$With \(w=1\), we have \(f(1)=1\) (since \(0\) is not in the codomain). This allows us to rewrite the functional equation as $$\frac{f(w^2)+f(x^2)}{f(y^2)+f(z^2)}=\frac{w^2+x^2}{y^2+z^2}$$Since \(w^2,x^2,y^2,z^2\) are arbitrary positive reals, we can let them be \(a,b,c,d\) respectively with \(ab=cd\) to rewrite the functional equation as $$\frac{f(a)+f(b)}{f(c)+f(d)}=\frac{a+b}{c+d}$$We might try letting \((a,b,c,d)=(t^2,1,t,t)\) for positive \(t\), as this allows us to exploit \(f(w^2)=f(w)^2\). Plugging these values in gives us $$\begin{aligned}tf(t)^2+t&=(t^2+1)f(t)\\(tf(t)-1)(f(t)-t)&=0\end{aligned}$$Thus, \(f(t)=t\) or \(f(t)=\frac1t\) for each \(t\), so \(f(x)=x\) and \(f(x)=\frac1x\) are our two solutions. Right?

In the partial solution above, we find that for every positive real \(t\), either \(f(t)=t\) or \(f(t)=\frac1t\). But that doesn't mean for all \(x\), \(f(x)=x\) or for all \(x\), \(f(x)=\frac1x\). What if \(f(x)=x\) for all \(x\in(0,1)\) and \(f(x)=\frac1x\) for all \(x\geq1\)? That satisfies the first statement, but not the second. This is known as the pointwise value trap.
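To see the trap concretely, here is a sketch (my own illustration) that plugs the mixed function into the original equation and exhibits inputs with \(wx=yz\) where it fails:

```python
def f(t):
    """The mixed candidate: f(t) = t on (0, 1) and f(t) = 1/t on [1, infinity)."""
    return t if t < 1 else 1 / t

def equation_holds(w, x, y, z, tol=1e-9):
    lhs = (f(w)**2 + f(x)**2) / (f(y**2) + f(z**2))
    rhs = (w**2 + x**2) / (y**2 + z**2)
    return abs(lhs - rhs) < tol

print(equation_holds(0.5, 8.0, 2.0, 2.0))   # False: wx = yz = 4, but the equation fails
print(equation_holds(3.0, 3.0, 3.0, 3.0))   # True: it can still hold for some inputs
```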

In fact, the two solutions we found are the only two solutions, but more work needs to be done to show this (can you complete the proof?)

To summarize, functional equations are really hard, and the only way to get used to the tools above is to practice. Try applying the tools in order of how they appeared in this section (first try substituting, then separating variables, etc).

Find all functions \(f:\mathbb R\to\mathbb R\) such that for all \(x,y\in\mathbb R\)$$f(x^2-y^2)=xf(x)-yf(y)$$

Let \(f:A\to A\) be a function such that for all \(x\in A\)$$\underbrace{f\circ f\circ\ldots\circ f}_{n\text{ times}}(x)=x$$Prove that \(f\) is both injective and surjective.