PPO: how do we compute the advantage and the value function?

I am going over the PPO formula and understand most of it, but I cannot work out how the advantage is computed. Where does it come from? I heard there is a recursive formula; can anyone elaborate on this?
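For context, I believe the recursion people mention is Generalized Advantage Estimation (GAE), which many PPO implementations use. Below is a minimal sketch of my current understanding, not a definitive implementation. The function name `gae_advantages`, the defaults for `gamma` and `lam`, and the convention that `values` carries one extra bootstrap entry at the end are all my own assumptions for illustration.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Sketch of the GAE recursion (hypothetical helper, not from any library).

    rewards: list of per-step rewards, length T
    values:  list of value estimates V(s_0..s_T), length T + 1
             (the last entry is the bootstrap value; use 0.0 if the
             episode terminated)
    """
    advantages = [0.0] * len(rewards)
    running = 0.0  # holds A_{t+1} while walking backwards through time
    for t in reversed(range(len(rewards))):
        # One-step TD error: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Recursive step: A_t = delta_t + gamma * lam * A_{t+1}
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


# Tiny example with made-up numbers
print(gae_advantages([1.0, 1.0], [0.5, 0.5, 0.0], gamma=1.0, lam=1.0))
```

If I read it correctly, with `lam=1.0` this reduces to "discounted return minus the value estimate", and with `lam=0.0` it reduces to the one-step TD error. What I still do not understand is where the `values` come from.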

Also, where is the value function? Is it inside the LLM? How do we get it?