How does the algorithm refinement (improved neural network architecture) help compute the RHS of Bellman's equation?

As said between 2:25-2:40 in the lecture on Algorithm refinement, we get Q values for all actions in a state s. How does this refinement of the neural network increase the efficiency of computing max Q(s’, a’), which actually relates to the next state s’?

Hello @robincs,

For reference, above is the before and after of the refinement. Two points I want to make:

First, the refinement allows a shorter input vector, which means fewer trainable parameters, and it produces all Q values (one per action) in a single forward pass of the network. Both reduce the amount of computation, making it more efficient!
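To make the comparison concrete, here is a minimal numpy sketch, with made-up layer sizes and random weights (the dimensions and function names are illustrative, not taken from the lecture). The original architecture takes state plus a one-hot action and returns one scalar Q value, so it needs one forward pass per action; the refined one takes the state alone and returns all Q values at once:

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, N_ACTIONS, HIDDEN = 8, 4, 16  # hypothetical sizes

# Original architecture: input = [state, one-hot action] -> one scalar Q(s, a)
W1_old = rng.normal(size=(STATE_DIM + N_ACTIONS, HIDDEN))
w2_old = rng.normal(size=HIDDEN)

def q_old(s, a):
    x = np.concatenate([s, np.eye(N_ACTIONS)[a]])   # longer input vector
    return np.maximum(x @ W1_old, 0) @ w2_old       # one scalar per call

# Refined architecture: input = state only -> vector of Q values, one per action
W1_new = rng.normal(size=(STATE_DIM, HIDDEN))
W2_new = rng.normal(size=(HIDDEN, N_ACTIONS))

def q_new(s):
    return np.maximum(s @ W1_new, 0) @ W2_new       # all Q values at once

s = rng.normal(size=STATE_DIM)
q_all_old = np.array([q_old(s, a) for a in range(N_ACTIONS)])  # N_ACTIONS passes
q_all_new = q_new(s)                                           # one pass
```

The old network needs N_ACTIONS forward passes (and a longer input vector) to get what the refined network produces in one.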

Second, you mentioned s’ without asking a specific question about it, but I’ll take the liberty of commenting on it. In the networks presented in the slides above, you see only s (not s’), and that s refers to “any state”, which can be the “current state” or the “next state”. In fact, to compute max Q(s’, a’), we simply set s = s’ and pass it through the network.
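In code, that last step is just: feed s’ through the refined network and take the max of the output vector. Here is a self-contained numpy sketch (layer sizes, `q_net`, and the reward/discount values are all hypothetical, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)
STATE_DIM, N_ACTIONS, HIDDEN = 8, 4, 16  # hypothetical sizes
W1 = rng.normal(size=(STATE_DIM, HIDDEN))
W2 = rng.normal(size=(HIDDEN, N_ACTIONS))

def q_net(s):
    """Refined network: a state in, Q values for all actions out."""
    return np.maximum(s @ W1, 0) @ W2

# Bellman RHS for one transition (s, a, r, s'):
s_next = rng.normal(size=STATE_DIM)          # the next state s'
reward, gamma = 1.0, 0.995                   # illustrative values
y = reward + gamma * np.max(q_net(s_next))   # set s = s', max over outputs
```

So the same network handles both sides of the equation: one call with the current state s, one call with the next state s’.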

Cheers,
Raymond
