First, let's recap the equations for the “Exponentially Weighted (Moving) Average”, starting from index 0. (Andrew wrote them in reverse order, but let's start in the normal order and then come back to his way.)
\begin{equation}
v_0 = 0 \quad \text{(actually, this can be any value)} \\
v_1 = \beta v_0 + (1-\beta) \theta_1 \\
v_2 = \beta v_1 + (1-\beta) \theta_2 \\
v_3 = \beta v_2 + (1-\beta) \theta_3 \\
\vdots
\end{equation}
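The recursion above is easy to try out numerically. Here is a minimal sketch in Python; the function name `ewma` and the sample \theta values are made up for illustration only.

```python
def ewma(thetas, beta, v0=0.0):
    """Return [v_1, ..., v_n] using v_t = beta * v_{t-1} + (1 - beta) * theta_t."""
    v = v0
    history = []
    for theta in thetas:
        # One step of the recursion: keep beta of the old value,
        # mix in (1 - beta) of the new observation.
        v = beta * v + (1.0 - beta) * theta
        history.append(v)
    return history


# Hypothetical observations theta_1, theta_2, theta_3 (not from the lecture).
thetas = [10.0, 20.0, 30.0]
print(ewma(thetas, beta=0.9))
```

Changing `v0` shows that the starting value only affects the early terms, since it gets multiplied by \beta at every step.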
Starting from v_2, we can expand each term by substituting the previous equation, as follows.
\begin{align}
v_2 & = \beta v_1 + (1-\beta)\theta_2 \\
&= \beta (\beta v_0 + (1-\beta)\theta_1) + (1-\beta) \theta_2 \\
&= \beta^2v_0 + \beta(1-\beta)\theta_1 + (1-\beta)\theta_2\\
v_3 & = \beta v_2 + (1-\beta)\theta_3 \\
&= \beta (\beta^2v_0 + \beta(1-\beta)\theta_1 + (1-\beta)\theta_2) + (1-\beta) \theta_3 \\
&= \beta^3v_0 + \beta^2(1-\beta)\theta_1 + \beta(1-\beta)\theta_2 + (1-\beta) \theta_3 \\
&\ \vdots
\end{align}
Next, let's write \theta_0 in place of v_0 (which can be non-zero), and see what v_{100} looks like.
v_{100} = \beta^{100}\theta_0 + \beta^{99}(1-\beta)\theta_1 + \beta^{98}(1-\beta)\theta_2 + \cdots + \beta(1-\beta)\theta_{99} + (1-\beta)\theta_{100}
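As a sanity check on the expansion of v_{100}, the sketch below compares the step-by-step recursion with the expanded sum. The function names and the \theta values are arbitrary choices for illustration, not anything from the lecture.

```python
def ewma_recursive(thetas, beta, theta0):
    """Apply v_t = beta * v_{t-1} + (1 - beta) * theta_t, starting from v_0 = theta0."""
    v = theta0
    for theta in thetas:
        v = beta * v + (1.0 - beta) * theta
    return v


def ewma_expanded(thetas, beta, theta0):
    """Evaluate beta^n * theta0 + sum_k beta^(n-k) * (1 - beta) * theta_k directly."""
    n = len(thetas)
    total = beta ** n * theta0
    for k, theta in enumerate(thetas, start=1):
        total += beta ** (n - k) * (1.0 - beta) * theta
    return total


# Arbitrary values standing in for theta_1 .. theta_100.
thetas = [float(k % 7) for k in range(1, 101)]
beta, theta0 = 0.9, 3.0

print(ewma_recursive(thetas, beta, theta0))
print(ewma_expanded(thetas, beta, theta0))
```

The two printed values should agree up to floating-point rounding.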
Andrew then splits this into two vectors: one for the coefficients and the other for \theta. The coefficient vector shows exactly the weights applied to each \theta.
If \beta = 0.9, the coefficients look as follows.
This corresponds to Andrew's second sketch.
The first sketch is the vector of \theta values.
Andrew then said that roughly 10 days, i.e., the 10 coefficients counting back from the 100th day, are (weighted) averaged. (Note that this is, strictly speaking, a weighted sum rather than an “average”.)
If you look at the figure, the sum over 10 days is actually not enough to reach, say, 95% of the total weight: for \beta = 0.9 the 10 most recent coefficients add up to 1 - 0.9^{10} \approx 0.65. (So it's not really a matter of 10 days versus 11 days…)
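To put numbers on the coefficient vector for \beta = 0.9, here is a small sketch. The helper name `coefficients` is hypothetical, and the initial \theta_0 term is left out because its weight \beta^{100} is negligible here.

```python
def coefficients(beta, n):
    """Weights on theta_n, theta_{n-1}, ..., theta_1 in v_n (most recent day first)."""
    return [(1.0 - beta) * beta ** k for k in range(n)]


weights = coefficients(beta=0.9, n=100)

# Share of the weight carried by the 10 most recent days vs. all 100 days.
last_10 = sum(weights[:10])
all_100 = sum(weights)

print(round(last_10, 4), round(all_100, 4))
```

The first number is the share carried by the last 10 days, and the second shows that all 100 coefficients together add up to almost exactly 1.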
Andrew's intuition is sometimes not exact from a mathematical point of view.
But from an “intuition” point of view, Andrew's explanation makes sense if we look at the following figures.
In the case of \beta = 0.2, only a few days matter, so we do not need to sum up all of the past 100 days with their weights (coefficients). In the case of \beta = 0.98, however, we need to use all of the coefficients, i.e., sum up all 100 days with their weights. The number of days to be considered may not be exactly equal to \frac{1}{1-\beta}.
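To compare different values of \beta, one can count how many of the most recent days are needed before the coefficients add up to a chosen share of the total. This is only a rough sketch: the 95% threshold is my own pick for illustration, the name `days_to_cover` is hypothetical, and the history is treated as unlimited rather than capped at 100 days.

```python
def days_to_cover(beta, fraction):
    """Smallest number of most-recent days whose weights sum to at least `fraction`."""
    days = 0
    covered = 0.0
    while covered < fraction:
        # Weight of the day that lies `days` steps in the past.
        covered += (1.0 - beta) * beta ** days
        days += 1
    return days


for beta in (0.2, 0.9, 0.98):
    rule_of_thumb = 1.0 / (1.0 - beta)
    print(beta, days_to_cover(beta, 0.95), rule_of_thumb)
```

With this threshold, \beta = 0.98 needs more than 100 days, which matches the picture where all 100 coefficients are in use.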
Regarding your second question, it is one variation of the definition of “Euler's number”.
The most famous one would be:
\lim_{n \to \infty}(1+\frac{1}{n})^{n} = e
But this is also one of the variations:
\lim_{\epsilon \to 0}(1-\epsilon)^{\frac{1}{\epsilon}} = \frac{1}{e}
Andrew used this with \epsilon = 1-\beta = 0.02. Then (1-\epsilon)^{\frac{1}{\epsilon}} = 0.98^{50} \approx 0.364, which is close to \frac{1}{e} \approx 0.368.
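The limit can also be checked numerically. A minimal sketch follows; the function name `approx_inv_e` and the list of \epsilon values are just for illustration.

```python
import math


def approx_inv_e(eps):
    """Evaluate (1 - eps) ** (1 / eps), which approaches 1/e as eps goes to 0."""
    return (1.0 - eps) ** (1.0 / eps)


for eps in (0.1, 0.02, 0.001):
    print(eps, approx_inv_e(eps), 1.0 / math.e)
```

The smaller \epsilon gets, the closer the middle column comes to \frac{1}{e}.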