Can positional encoding be meaningfully generalized to non-integer position?

kechan · July 18, 2021, 12:31am

One of the new things I learned from the Transformer material is the use of positional encoding. It is quite a fascinating concept. If mentors know of any papers or resources that focus on the theoretical (and intuitive) properties of good positional encodings in general, please post.

I also wonder out loud whether “pos” in the sin/cos positional encoding used in the Transformer can be meaningfully generalized to non-integer positions, e.g. if your sequence is a time series and each position is annotated with the time the event took place (which is a float).
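To make the question concrete, here is a minimal sketch of the standard sin/cos encoding evaluated at float positions. It is an illustration, not the course lab's code: the function name `sinusoidal_encoding`, the helper `dot`, `d_model=8`, and the sample timestamps are all assumptions made up for this example. The formula is well defined for any real `pos`. Whether the result is useful for a given time series is a separate, empirical question.

```python
import math

def sinusoidal_encoding(pos, d_model=8, base=10000.0):
    """Sin/cos encoding of a single position; pos may be any real number."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    enc = []
    for i in range(d_model // 2):
        # Same angle as the integer case: pos / base^(2i / d_model)
        angle = pos / (base ** (2 * i / d_model))
        enc.append(math.sin(angle))  # even index 2i
        enc.append(math.cos(angle))  # odd index 2i + 1
    return enc

def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical event timestamps (floats), standing in for integer positions
timestamps = [0.0, 0.37, 1.0, 2.5]
encodings = [sinusoidal_encoding(t) for t in timestamps]

# The dot product of two encodings depends only on the time difference,
# so these two pairs (both 0.63 apart) give the same similarity.
sim_a = dot(sinusoidal_encoding(0.37), sinusoidal_encoding(1.0))
sim_b = dot(sinusoidal_encoding(2.0), sinusoidal_encoding(2.63))
```

One thing to watch with real timestamps is scale: the `base` of 10000 was chosen for token indices, so times measured in seconds since epoch would probably need rescaling first (this is a guess about practice, not something from the course).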

Furthermore, this reminded me (at least superficially) of the Fourier transform. Were the researchers motivated by this?
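Without speaking to what motivated the authors, the resemblance can at least be made concrete. Each sin/cos pair is a point on the unit circle at a fixed frequency, so shifting the position by any real offset is a rotation of that pair. The sketch below checks this with the angle-addition identities. The names `encode_pair` and `rotate`, and the chosen frequency and offsets, are assumptions for illustration only.

```python
import math

def encode_pair(pos, freq):
    """One (sin, cos) pair of the encoding at a single frequency."""
    return (math.sin(freq * pos), math.cos(freq * pos))

def rotate(pair, freq, delta):
    """Shift an encoded pair by delta using only the pair itself."""
    s, c = pair
    a = freq * delta
    # sin(x + a) = sin(x)cos(a) + cos(x)sin(a)
    # cos(x + a) = cos(x)cos(a) - sin(x)sin(a)
    return (s * math.cos(a) + c * math.sin(a),
            c * math.cos(a) - s * math.sin(a))

# Hypothetical setting: frequency of pair i = 1 when d_model = 8
freq = 1.0 / (10000.0 ** (2 / 8))
pos, delta = 3.2, 0.45  # non-integer position and offset

shifted = rotate(encode_pair(pos, freq), freq, delta)
direct = encode_pair(pos + delta, freq)
```

Here `shifted` and `direct` agree up to floating-point error, and nothing in the argument requires `pos` or `delta` to be integers.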

manifest · July 18, 2021, 8:19am

Hey @kechan, I totally agree with you on position encoding.

You may find this overview interesting, though I haven’t read it yet. It’s worth investigating which types of position encoding have demonstrated good results on time-series data, rather than adapting sin/cos.

F-Nets demonstrated good results on some tasks.

kechan · July 18, 2021, 5:20pm

Thanks. I am going through the ungraded lab on Transformer Preprocessing, and it is actually a good start.
