Question 3:
According to the feedback to question 3, this equation (screenshot) is called the incremental update rule and it has the purpose of overcoming the problem of recomputing metrics when using sequential analysis to detect concept drift:

This equation is indeed present at the bottom of one of the lecture slides for sequential analysis, but no mention or explanation of it is given.

Question 4:

Drift detection techniques in unsupervised settings typically suffer from the curse of dimensionality. Which of the following techniques is an appropriate solution to mitigate the effects of this curse?

k-means is rejected as a possible correct answer.

But I thought k-means could also be used to reduce the number of dimensions. Am I wrong?

Thank you very much for your questions. For question 4, note that k-means clusters your data according to the distance from each point to a set of centre points. For the model to work, you therefore need to calculate distances between the data points and the cluster centres. The problem with the curse of dimensionality is that the space becomes so sparse that all data points are extremely far away from one another, so a measure like distance is no longer very helpful. For methods like k-means to work, you first need to map your data points to a lower-dimensional space (e.g., with PCA) and then apply your algorithm (e.g., k-means). I hope that answers your question.
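To make the point about distances concrete, here is a minimal sketch using only the Python standard library. It assumes uniformly random points in a unit hypercube, and the helper name `relative_contrast` is made up for illustration. It compares how much farther the farthest point is than the nearest one, relative to the nearest, in 2 and in 1000 dimensions.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest distance from one query point.

    Hypothetical helper: points are drawn uniformly from the unit hypercube,
    which is an assumption made only for this illustration.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    return (max(dists) - min(dists)) / min(dists)


low = relative_contrast(dim=2)
high = relative_contrast(dim=1000)
print(f"relative contrast in 2 dimensions:    {low:.2f}")
print(f"relative contrast in 1000 dimensions: {high:.2f}")
```

In the high-dimensional case the contrast should come out much smaller, meaning the nearest and farthest points are at nearly the same distance. That is why distance-based assignment to cluster centres becomes unreliable unless the data is first projected to fewer dimensions.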
Could you please provide more detail about what seems vague to you in question 3?

For Q3, the equation (according to my original question text) was not explained.

For me personally, to understand an equation, it should be immediately followed by an explanation of the notation, then an example of how the equation is used, and maybe a worked example.

E.g. “Where Pt* denotes …, n* denotes, t denotes, Iyt denotes, yhat t denotes”.
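As a rough idea of the kind of worked example meant here: the screenshot is not reproduced in this thread, so the sketch below is not the slide's equation. It assumes the generic running-mean update applied to accuracy, and the names `update_accuracy`, `prev_acc` and `n_seen` are made up for illustration. The point is that each new prediction updates the metric without recomputing it over all earlier data.

```python
def update_accuracy(prev_acc, n_seen, y_true, y_pred):
    """Fold one new prediction into a running accuracy.

    Assumed form (generic running mean, not taken from the slide):
        acc_n = acc_{n-1} + (correct_n - acc_{n-1}) / n
    where correct_n is 1 if the prediction matches the label, else 0.
    """
    correct = 1.0 if y_true == y_pred else 0.0
    n = n_seen + 1
    return prev_acc + (correct - prev_acc) / n, n


# Toy stream of (true label, predicted label) pairs, invented for the example.
stream = [(1, 1), (0, 1), (1, 1), (0, 0)]

acc, n = 0.0, 0
for y_true, y_pred in stream:
    acc, n = update_accuracy(acc, n, y_true, y_pred)
    print(f"after {n} samples: accuracy = {acc:.4f}")

# Recomputing from scratch gives the same value as the incremental update.
batch_acc = sum(1 for t, p in stream if t == p) / len(stream)
```

Here three of the four predictions are correct, so both the incremental value and the recomputed one end at 0.75.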