Transformer Pre-Processing Lab Question

"Another interesting property is that the norm of the difference between 2 vectors separated by k positions is also constant. If you keep k constant and change pos, the difference will be of approximately the same value. This property is important because it demonstrates that the difference does not depend on the positions of each encoding, but rather the relative separation between encodings. Being able to express positional encodings as linear functions of one another can help the model to learn by focusing on the relative positions of words."

Can someone provide some example inputs and a use case that demonstrate the k-positions property mentioned above?
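Here is a quick numerical sketch of the property (assuming the standard sinusoidal encoding, with hypothetical values d_model=64 and k=5/k=20 chosen just for illustration). Fixing k and sweeping pos, the norm of PE(pos+k) - PE(pos) stays constant; changing k gives a different constant:

```python
import numpy as np

def positional_encoding(max_pos, d_model):
    # Standard sinusoidal positional encoding:
    # PE(pos, 2i)   = sin(pos / 10000^(2i/d_model))
    # PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))
    pos = np.arange(max_pos)[:, np.newaxis]        # shape (max_pos, 1)
    i = np.arange(d_model)[np.newaxis, :]          # shape (1, d_model)
    angle_rates = 1.0 / np.power(10000.0, (2 * (i // 2)) / d_model)
    angles = pos * angle_rates                     # shape (max_pos, d_model)
    pe = np.empty((max_pos, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])          # even dimensions: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])          # odd dimensions: cosine
    return pe

pe = positional_encoding(128, 64)

# Fix k and vary pos: the norm of the difference is the same for every pos.
k = 5
norms_k5 = [np.linalg.norm(pe[pos + k] - pe[pos]) for pos in range(100)]
print(f"k={k}:  min={min(norms_k5):.6f}, max={max(norms_k5):.6f}")

# A different k gives a different (but again constant) norm.
k = 20
norms_k20 = [np.linalg.norm(pe[pos + k] - pe[pos]) for pos in range(100)]
print(f"k={k}: min={min(norms_k20):.6f}, max={max(norms_k20):.6f}")
```

The reason it works: each pair of dimensions (sin(w*pos), cos(w*pos)) traces a circle, so moving from pos to pos+k is a rotation by the fixed angle w*k. The per-pair squared difference is 2 - 2*cos(w*k), which depends only on k, never on pos; summed over all frequency pairs, the total norm is exactly constant (up to floating point).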

For those who find this thread later, this article may be useful: