**1. Defensive Distillation:**

I sort of understand that it works a bit like knowledge distillation but without the smaller network. But I have no idea what `sample creation` refers to in the following sentence. What sample?

“Defensive distillation reduced the effectiveness of a sample creation from 95% to less than 0.5% in one study.”

**2. Residual Analysis:**

I gather that this involves calculating something called `residuals`, and that a random distribution of residuals indicates the model is working well. I also gather that residuals are calculated in a similar way to a cost function in ML regression, so you can use something like root mean squared error, and that a residual is the distance between a prediction and the true value. But I don’t understand the following sentences:

“The residuals should not be correlated with another feature that was available but was left out of the feature vector.”

How can RMSE-like values correlate or not correlate with a feature? In what way?

“Also, adjacent residuals should not be correlated with each other, in other words, they should not be autocorrelated.”

How can they be adjacent? In what context can they be adjacent or not adjacent?

Both of these suggest to me that I have a conceptual misunderstanding. What do these residuals actually look like, and what is meant by a ‘feature’ or by being ‘adjacent’ in these contexts?
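For reference, this is roughly how I picture the calculation at the moment. It is only a minimal sketch on made-up data, assuming NumPy and a simple least-squares line fit; the variable names are my own and do not come from the text I'm quoting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: y depends linearly on x, plus noise.
x = np.linspace(0.0, 10.0, 50)
y_true = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

# Fit a straight line by least squares (np.polyfit returns slope, intercept).
slope, intercept = np.polyfit(x, y_true, deg=1)
y_pred = slope * x + intercept

# Residuals: one signed value per sample (true minus predicted).
residuals = y_true - y_pred

# RMSE: a single number summarizing all of the residuals.
rmse = np.sqrt(np.mean(residuals ** 2))

print(residuals.shape)  # one entry per sample
print(rmse)
```

So I end up with one residual per sample, but only a single RMSE value for the whole dataset.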

**3. Resources/papers:**

Are there any recommended resources or papers for these two topics?