C3w4 - defensive distillation & residual analysis unclear

1. Defensive Distillation:
I sort of understand that it works a bit like knowledge distillation, but without the smaller network. However, I have no idea what “sample creation” refers to in the following sentence. What sample?

“Defensive distillation reduced the effectiveness of a sample creation from 95% to less than 0.5% in one study.”
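For reference, this is how I currently picture the distillation part, so please correct me if it is off. As far as I understand, a first network is trained with a high softmax temperature, and its softened output probabilities are then used as labels to train a second network of the same architecture. Below is a minimal sketch of just the temperature step. The function name, the logits, and the temperature of 20 are made up for illustration and are not taken from the course.

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over the last axis, with the logits divided by a temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)

# Made-up logits of the first (teacher) network for a single input.
teacher_logits = np.array([[8.0, 2.0, 0.5]])

# Temperature 1 gives a nearly one-hot distribution.
hard_probs = softmax_with_temperature(teacher_logits, temperature=1.0)

# A high temperature spreads the probability mass over all classes;
# these "soft labels" would be the training targets for the second network.
soft_labels = softmax_with_temperature(teacher_logits, temperature=20.0)

print(hard_probs.round(3))
print(soft_labels.round(3))
```

If I read it correctly, the softer targets are what is supposed to make the second network less sensitive to small input perturbations, but the sketch does not show that part.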


2. Residual Analysis:

I got that this involves calculating residuals, and that a random distribution of residuals indicates the model is working well. I also get that residuals are calculated in a similar way to a cost function in ML regression, so that you can use something like a root mean squared error (RMSE) calculation, and that they represent the distance between predictions and true values. But I don’t understand the following sentences:

“The residuals should not be correlated with another feature that was available but was left out of the feature vector.”

How can RMSE-like values correlate, or not correlate, with a feature? In what way?
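My tentative reading, which I would like confirmed: if the residuals are kept per sample (one `y_true - y_pred` value per row) rather than aggregated into a single RMSE number, they form a column that could be correlated with any other column, including a feature the model never saw. Here is a minimal sketch on synthetic data. All names (`x_used`, `x_left_out`) and the data-generating formula are assumptions of mine.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic data: the target depends on two features, but the model only sees one.
x_used = rng.normal(size=n)
x_left_out = rng.normal(size=n)
y = 2.0 * x_used + 1.5 * x_left_out + rng.normal(scale=0.1, size=n)

# Fit a straight line using only x_used (np.polyfit returns slope, then intercept).
slope, intercept = np.polyfit(x_used, y, deg=1)
y_pred = slope * x_used + intercept

# Residuals: one value per sample, not a single aggregate.
residuals = y - y_pred

# RMSE collapses all residuals into one number.
rmse = np.sqrt(np.mean(residuals ** 2))

# Correlation of the residual column with each feature column.
corr_left_out = np.corrcoef(residuals, x_left_out)[0, 1]
corr_used = np.corrcoef(residuals, x_used)[0, 1]

print("number of residuals:", residuals.shape[0])
print("RMSE:", rmse)
print("corr(residuals, x_left_out):", corr_left_out)
print("corr(residuals, x_used):", corr_used)
```

In this toy setup the residuals correlate strongly with the left-out feature and essentially not at all with the feature the model was fitted on. Is that what the sentence is warning about?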

“Also, adjacent residuals should not be correlated with each other, in other words, they should not be autocorrelated.”

How can they be adjacent? In what context can residuals be adjacent or not adjacent?
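My guess is that “adjacent” only makes sense when the rows have a natural order, for example time, so that residual `i` and residual `i + 1` are neighbours. Here is a minimal sketch under that assumption. The synthetic series, the helper `lag1_autocorrelation`, and the choice of lag 1 are mine, not from the course.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic time-ordered data: a linear trend plus a slow wave plus noise.
t = np.arange(300)
y = 0.05 * t + np.sin(t / 10.0) + rng.normal(scale=0.1, size=t.size)

# Fit only a straight line, so the wave is left unexplained.
slope, intercept = np.polyfit(t, y, deg=1)
residuals = y - (slope * t + intercept)

def lag1_autocorrelation(values):
    """Correlation between each value and the value that follows it."""
    return np.corrcoef(values[:-1], values[1:])[0, 1]

# Residuals in their original time order versus the same residuals shuffled.
ordered = lag1_autocorrelation(residuals)
shuffled = lag1_autocorrelation(rng.permutation(residuals))

print("lag-1 autocorrelation, time order:", ordered)
print("lag-1 autocorrelation, shuffled:", shuffled)
```

In time order, neighbouring residuals are strongly correlated because the line misses the wave. After shuffling, the ordering is meaningless and the correlation is close to zero. Is this the kind of autocorrelation the sentence refers to?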

Both of these suggest to me that I have a conceptual misunderstanding. What do these residuals actually look like, and what is meant by a ‘feature’ or by being ‘adjacent’ in these contexts?

3. Resources/papers:
Are there any recommended resources or papers for these two topics?