If reinforcement learning is also computing all possible paths, like a pre-determined function, what makes RL different, and what advantages does RL have over a pre-determined function?
Welcome to the Community!
The main difference is that RL is a training method based on rewarding desired behaviors and/or punishing undesired ones. RL is also based on interaction between an AI system and its environment: the agent looks for a suitable action to maximize reward in a particular situation, and it keeps learning from the decisions it takes. A pre-determined model, by contrast, is trained only once and has no way of knowing whether a decision it took was correct; its output simply follows from the model parameters and weights. If it takes a wrong decision it is not punished, and if it takes a right one it is not rewarded.
In short, machine learning enables computers to learn from data, whereas reinforcement learning is a type of machine learning that lets machines learn how to take actions in an environment so as to maximize a reward.
No, I am not asking about the difference between a pre-trained model and RL. I am asking about RL versus ordinary function code written in any language, i.e. a pre-determined function in programming. If reinforcement learning is also computing all possible paths, like a pre-determined function, what makes RL different, and what advantages does RL have over a pre-determined function? Please read the question again and correct me if I am wrong.
Please refer to the image below for a better understanding of my question:
@Arjun_Reddy OK. First, we know all the possible paths, but we do not know the rewards and punishments for all of them. (In this example the rewards are known, but only to give an intuition for the explanation.) If we knew the rewards of every possible path, we would not need machine learning or an RL model at all: we would just write conditions over all the paths and choose the best one. In practice that is not possible, so we want the model to learn from its actions in an environment so as to maximize reward. A pre-determined function cannot do that, because we do not know the rewards of all possible paths. The other main difference is that RL keeps learning from every decision it takes, unlike a pre-determined function. RL is a training method based on rewarding desired behaviors and/or punishing undesired ones, which a pre-determined function does not have.
Got it. But I just want to know how an RL model learns from its past actions in an environment without knowing all the possible rewards. It would be a great help if you could explain this with an example in which the RL model learns from its past actions without knowing all the rewards.
During training, we (the designers) know the rewards and punishments for all possible places, but the RL model does not. It interacts with the environment, and depending on the next place (step) it moves to, its reward goes up or down and the parameters of the RL model are updated. Once the model is trained, we evaluate it on test conditions to see how it behaves. After that, we let the model interact with real live data (conditions). In that case we of course do not know the rewards and punishments of every step, so the model relies on the experience it gained in the training and test phases.
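As a rough sketch of the training phase described above, here is tabular Q-learning on a tiny corridor. Everything specific in it is an assumption chosen for illustration: the 5-cell corridor, the reward values (+1 at the goal, -0.01 per other step), the hyperparameters, and the names `step` and `train`. Only the environment function `step` knows the rewards; the agent just receives one reward per move and updates its table from that.

```python
import random

N_STATES = 5          # cells 0..4 in a corridor; cell 4 is the goal
ACTIONS = [-1, +1]    # move left or move right
GOAL = N_STATES - 1


def step(state, action):
    """Environment: the agent never reads this rule, it only sees the result."""
    next_state = min(max(state + action, 0), GOAL)  # walls at both ends
    done = next_state == GOAL
    reward = 1.0 if done else -0.01                 # assumed reward scheme
    return next_state, reward, done


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: update value estimates from each observed step."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]       # q[state][action_index]
    for _ in range(episodes):
        state = 0
        for _ in range(100):                        # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))     # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            # Target: observed reward plus discounted best estimate of what follows.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-goal cells: the action with the highest learned value.
policy = [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q_table[s][i])]
          for s in range(GOAL)]
print("learned policy (per cell):", policy)
```

After training, the table `q_table` is the "experience" the model carries forward: in a new situation it picks the action with the highest learned value, without anyone telling it the rewards. Under the assumptions above, the greedy policy should end up moving right in every cell.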