# Intersection of Cantor Sets

I will write about an interesting talk given by Pablo Shmerkin.

Main goal: compute $dim(A\cap B)$ for $A,B\subset \mathbb{R}$ with dynamical properties.

Theorem [Marstrand, 1954] Let $E\subset\mathbb{R}^2$ be a Borel set. Given $\theta\in (0,\pi),$ define $L_{\theta}$ to be the family of all lines in the plane with angle $\theta.$ Then for almost all $l\in L_{\theta}$ we have that $dim(E\cap l)\leq \max (dim(E)-1,0).$

Example
The graph $G$ of a Brownian motion in the plane has $dim(G)=\frac{3}{2};$ however, every vertical line meets $G$ in a single point, so those intersections have Hausdorff dimension zero.

Theorem [Marstrand, 1954] Let $E\subset\mathbb{R}^2$ be a Borel set. Then $\mathrm{ess}\sup_{l\in L_{\theta}} dim(E\cap l)=\max (dim(E)-1,0).$

We can obtain the following corollaries for the Hausdorff dimension of the intersection of subsets $A,B\subset \mathbb{R}.$

Corollary Let $A,B\subset\mathbb{R}$ be Borel sets. Then for Leb-almost all $t$ we have that $dim(A\cap (B+t))\leq \max (dim(A\times B)-1,0).$

Remember that $dim(A\times B)\geq dim(A)+dim(B),$ with equality if $dim(A)=\overline{dim}_{BOX}(A).$ Therefore, we have the following corollaries.

Corollary If $A,B\subset\mathbb{R}$ are Borel sets and $dim(A)=\overline{dim}_{BOX}(A),$ then for Leb-almost all $t$ we have that $dim(A\cap (B+t))\leq \max (dim(A)+dim(B)-1,0).$

And from Marstrand's second theorem, we have the following corollary.

Corollary $\mathrm{ess}\sup dim(A\cap g(B))=\max (dim(A\times B)-1,0),$ where the essential supremum is taken over affine maps $g:\mathbb{R}\to\mathbb{R}.$

Some examples.

Example
There exists a compact set $A\subset \mathbb{R}$ with $dim(A)=1$ such that $A\cap(A+t)$ contains at most one point for every $t\neq 0;$ hence $dim(A\cap(A+t))=0,$ while the bound of the corollary is $\max (dim(A)+dim(A)-1,0)=1.$

For every $s\in[0,1]$ there exists $A\subset \mathbb{R}$ with $dim(A)=s$ and $dim(A\cap (A+t))=s$ for all $t\in\mathbb{R}$ [Falconer's example].

Furstenberg asked what happens if $A,B$ are invariant under the map $T_p(x)=px \mod 1.$ Motivated by these questions, let us first consider the classical middle-thirds Cantor set $C\subset \mathbb{R},$ for which $dim(C)=\overline{dim}_{BOX}(C)=\frac{\log 2}{\log 3}.$ Applying Marstrand's theorem directly, we obtain the bound $dim(C\cap (C+t))\leq 2\frac{\log 2}{\log 3}-1\approx 0.262$ for almost all $t.$ There is, however, a better bound due to Hawkes.

Theorem [Hawkes, 1975] $dim(C\cap (C+t))= \frac{1}{3}\frac{\log 2}{\log 3}\approx 0.21$ for almost all $t.$
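As a quick sanity check on the constants above, both the Marstrand-type bound and Hawkes' almost-everywhere value can be computed numerically:

```python
import math

# dim(C) = log 2 / log 3 for the middle-thirds Cantor set
dim_C = math.log(2) / math.log(3)

# Marstrand-type bound: dim(C x C) - 1 = 2 dim(C) - 1
marstrand_bound = 2 * dim_C - 1

# Hawkes' almost-everywhere value: dim(C) / 3
hawkes_value = dim_C / 3
```

Note that Hawkes' value $\approx 0.21$ is strictly smaller than the Marstrand bound $\approx 0.262$, so the a.e. dimension drop is genuine.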

Kenyon and Peres generalized the result further in 1991 to the case $p\geq 2,$ $D_1,D_2\subset \{0,1,\ldots,p-1\},$ $A_i=\{\sum_{j\geq 1} x_j p^{-j}: x_j\in D_i\}.$ Notice that $C=\{\sum_{j\geq 1} x_j 3^{-j}: x_j\in \{0,2\}\}.$
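To make the sets $A_i$ concrete, here is a small sketch (the function name is my own) that generates the level-$k$ approximation of such a digit-restricted set by truncating the base-$p$ expansion:

```python
def digit_set_points(digits, p, depth):
    """All truncated sums sum_{j=1}^depth x_j * p**-j with x_j in `digits` --
    a finite approximation of A = {sum_j x_j p^-j : x_j in D}."""
    points = [0.0]
    for j in range(1, depth + 1):
        points = [x + d * p ** -j for x in points for d in digits]
    return sorted(points)

# With D = {0, 2} and p = 3 this approximates the middle-thirds Cantor set C.
cantor_approx = digit_set_points([0, 2], 3, 5)  # 2^5 = 32 points
```

The largest point of the level-5 approximation is $\sum_{j=1}^{5} 2\cdot 3^{-j}=1-3^{-5}$, matching the right endpoint of $C$ up to the truncation.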

Theorem [R. Kenyon and Y. Peres, 1991] For almost all $t\in [0,1],$ $dim(A_1\cap (A_2+t))= \frac{\lambda}{\log p},$ where $\lambda$ is the largest Lyapunov exponent of the product of i.i.d. matrices $M_r,$ with $M_r(i,j)=\#\big((D_1+i+r)\cap (D_2+pj)\big),$ $r=0,\ldots, p-1,$ $0\leq i,j\leq 1.$
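The Lyapunov exponent in the theorem can be estimated numerically. The sketch below implements the standard Furstenberg-Kesten estimate $\lambda \approx \frac{1}{n}\sum_k \log\|M_{r_k} v_k\|$, renormalizing a test vector at each step; the example matrix at the end is a deterministic placeholder to check the estimator, not one of the actual $M_r$ of the theorem.

```python
import math
import random

def top_lyapunov(matrices, n_steps=100_000, seed=0):
    """Estimate the largest Lyapunov exponent of a product of i.i.d. 2x2
    matrices (each a row-major tuple (a, b, c, d)) drawn uniformly from
    `matrices`, by renormalizing a test vector at each step."""
    rng = random.Random(seed)
    v = (1.0, 1.0)
    log_sum = 0.0
    for _ in range(n_steps):
        a, b, c, d = rng.choice(matrices)
        w = (a * v[0] + b * v[1], c * v[0] + d * v[1])
        norm = math.hypot(*w)
        log_sum += math.log(norm)
        v = (w[0] / norm, w[1] / norm)
    return log_sum / n_steps

# Sanity check with the single matrix diag(2, 1), whose exponent is log 2.
lam = top_lyapunov([(2.0, 0.0, 0.0, 1.0)])
```

Feeding in the matrices $M_0,\ldots,M_{p-1}$ of the theorem with the digit distribution of $t$ would give the a.e. dimension $\lambda/\log p$.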

Bárány and Rams in 2014 considered the case of self-similar sets $E\subset [0,1]^2$ obtained by a generalization of the construction of $C\times C$ (e.g., consider the unit square $[0,1]^2,$ divide it into $9$ squares of side $1/3,$ label the squares by $\{1,2,\ldots,9\},$ identified with $\{0,1,2\}^2.$ Choose $4$ squares, say $\Omega=\{1,2,5,9\},$ delete the rest of the squares, and repeat the procedure on each of the $4$ remaining squares, etc.).
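A minimal sketch of this subdivision procedure (the digit-pair encoding of $\Omega$ below is my own illustrative choice, not necessarily the squares $\{1,2,5,9\}$ of the example):

```python
from fractions import Fraction
import math

def level_squares(omega, p, levels):
    """Lower-left corners (exact rationals) of the squares of side p**-levels
    that survive `levels` steps of the subdivision, where omega lists the
    kept sub-squares as (i, j) digit pairs in {0, ..., p-1}^2."""
    corners = [(Fraction(0), Fraction(0))]
    for k in range(1, levels + 1):
        side = Fraction(1, p ** k)
        corners = [(x + i * side, y + j * side)
                   for (x, y) in corners for (i, j) in omega]
    return corners

omega = [(0, 0), (1, 0), (1, 1), (2, 2)]   # 4 of the 9 sub-squares (p = 3)
squares = level_squares(omega, 3, 4)       # 4^4 = 256 surviving squares

# For such grid-aligned self-similar sets, dim(E) = log(#omega) / log(p).
dim_E = math.log(len(omega)) / math.log(3)
```

With $\#\Omega=4$ and $p=3$ this gives $dim(E)=\log 4/\log 3\approx 1.26$, the same value as for $C\times C$.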

Theorem [B. Bárány and M. Rams, 2014] Let $E\subset \mathbb{R}^2$ be a set constructed by the process described above, where $\Omega\subset \{0,1,\ldots,p-1\}^2,$ $p\nmid \#\Omega$ and $\#\Omega>p.$ If $\tan \theta \in \mathbb{Q},$ then there exist $\alpha(\theta),\beta(\theta)$ with $\alpha(\theta)>dim(E)-1>\beta(\theta)$ such that $dim(E\cap l)=\alpha(\theta)$ for $\nu$-almost all $l\in L_{\theta},$ where $\nu$ is a natural measure supported on $E,$ and $dim(E\cap l)=\beta(\theta)$ for Leb-almost all $l\in L_{\theta}.$

The irrational case was recently studied by Pablo.

Theorem [P. Shmerkin, 2016] Let $E\subset \mathbb{R}^2$ be a closed set, invariant under the map $(x,y)\mapsto(T_p(x),T_p(y)).$ If $\tan \theta \notin \mathbb{Q},$ then $dim(E\cap l)\leq \max(dim(E)-1,0)$ for all $l\in L_{\theta}.$

The proof of the theorems follows from an application of a tool now known as the BSG (Balog-Szemerédi-Gowers) theorem, which has found many further applications. As far as I can remember, it is a theorem in combinatorics, proved using graph theory, that provides structure on the integers; it allows one to obtain lower bounds by convolving indicator functions, producing an $\ell^2$ norm that can be bounded by an $\ell^1$ norm outside of a small set. Remember that the BSG theorem is the main ingredient of the 1998 proof of the first effective bounds for Szemerédi's theorem, showing that any subset $A\subset\{1,2,\ldots,N\}$ free of $k$-term arithmetic progressions has cardinality $O(N(\log \log N)^{-c_k})$ for an appropriate $c_k>0.$

Notice that the theorem applies in particular to $E=C\times C.$ Pablo was not convinced by the conjecture that indeed $dim(E\cap l)\in \{dim(E)-1,0\}$ for all such lines.

As a corollary, we conclude that if $\lambda\notin\mathbb{Q},$ then $dim(C\cap (\lambda C +t))\leq 2\frac{\log 2}{\log 3}-1$ for all $t\in \mathbb{R}.$

The following theorem was obtained independently in 2016 by P. Shmerkin and M. Wu:

Theorem [P. Shmerkin; M. Wu, 2016] If $p$ and $q$ are not powers of the same integer and $A,B$ are closed sets, invariant under $T_p$ and $T_q$ respectively, then
$dim(A\cap (\lambda B+t))\leq \max(dim(A)+dim(B)-1,0)$ for all $\lambda\neq 0,$ $t\in\mathbb{R}.$

Suppose you want to know the number of fish in a lake, say $n.$ It is extremely difficult to count them all; however, you can estimate $n.$ To do this, take $p$ fish, mark them, and return them to the lake. After some time, take $m$ fish and count how many are marked. If you assume that after enough time the proportion of marked fish is the same at "every scale", i.e., if you take any number $k$ of fish you will always obtain a number $r(k)$ of marked ones with $\frac{r(k)}{k}=\frac{p}{n},$ then you can estimate $n$ by $\frac{p m}{r(m)},$ where $r(m)$ is the number of marked fish among the $m$ you took. Mathematically, the validity of this procedure is justified by the mixing property, which is a strong condition on the system.
Another strategy for estimating $n$ is to capture $m$ fish every $10$ days, count how many are marked, and then return them to the lake. Suppose the $i$-th time you captured fish you obtained $r_i$ marked ones. If you do this many times, say $N,$ then you can estimate $n$ by $\frac{Np}{\sum_{i=1}^N\frac{r_i}{m}}= \frac{Npm}{\sum_{i=1}^N r_i}.$ Mathematically, the validity of this procedure is justified by ergodicity, which is weaker than the mixing condition needed for the previous strategy.
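The two estimation strategies can be compared in a toy simulation (all numbers below are made up for illustration):

```python
import random

def mixing_estimate(n, p, m, rng):
    """One capture of m fish: estimate n as p*m / (marked fish in sample)."""
    lake = [True] * p + [False] * (n - p)   # True = marked fish
    r = sum(rng.sample(lake, m))
    return p * m / r

def ergodic_estimate(n, p, m, rounds, rng):
    """Average the marked count over many captures (fish are returned
    between captures), then estimate n as rounds*p*m / (total marked)."""
    lake = [True] * p + [False] * (n - p)
    total_marked = sum(sum(rng.sample(lake, m)) for _ in range(rounds))
    return rounds * p * m / total_marked

rng = random.Random(1)
n_true = 10_000
est_one = mixing_estimate(n_true, p=1_000, m=500, rng=rng)
est_avg = ergodic_estimate(n_true, p=1_000, m=500, rounds=200, rng=rng)
```

A single capture is noisy, while averaging over many captures concentrates the estimate, mirroring how the ergodic average needs only a weaker property than mixing to converge.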