Hi,
why are the probabilities in the rnn_step_forward function in utils.py called unnormalized log probabilities? This is the 2nd assignment of the 1st week.
Maybe only the second comment, "probabilities for next chars", should be kept?
def rnn_step_forward(parameters, a_prev, x):
    Waa, Wax, Wya, by, b = parameters['Waa'], parameters['Wax'], parameters['Wya'], parameters['by'], parameters['b']
    a_next = np.tanh(np.dot(Wax, x) + np.dot(Waa, a_prev) + b) # hidden state
    p_t = softmax(np.dot(Wya, a_next) + by) # unnormalized log probabilities for next chars # probabilities for next chars
    return a_next, p_t
Shouldn’t only the arguments of the softmax function be called unnormalized log probabilities?
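To illustrate the distinction, here is a minimal sketch (assuming the stabilized softmax from utils.py) showing that the softmax output is a normalized distribution while its argument z is not:

```python
import numpy as np

def softmax(x):
    # same stabilized softmax as in utils.py (assumption)
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0)

z = np.array([1.0, 2.0, 3.0])  # stands in for np.dot(Wya, a_next) + by
p = softmax(z)

print(np.sum(p))  # sums to 1 (up to float error): p is normalized
print(np.sum(z))  # 6.0: z itself is not normalized
```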
Softmax is defined in this way (for the sake of simplicity, I didn't include the max(z) subtraction that is included in the original definition in utils.py):

p_{j} = exp(z_{j}) / sum_{k} exp(z_{k})
If we take -log(p_{j}), that would be proportional to z_{j} but not normalized. Now it makes sense to me to call z_{j} an unnormalized log probability. The expression for -log(p_{j}) is:

-log(p_{j}) = -z_{j} + log(sum_{k} exp(z_{k}))
and the proportionality coefficient is equal to -1:

-(log(p_{j}) + log(sum_{k} exp(z_{k}))) / z_{j} = -1
As an example of it in Python:
z = np.random.randn(5)
print(-np.log(softmax(z)))
[1.77946694 1.5168839 2.20200317 0.82859276 2.73903499]
which is clearly not normalized but is proportional to z_{j}. Checking the coefficient:
k = -((np.log(softmax(z)) + np.log(np.sum(np.exp(z))))/z)
print(k)
[-1. -1. -1. -1. -1.]
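The same point can be checked from the other direction: z differs from log(p) only by the additive constant log(sum(exp(z))), and softmax is invariant to adding a constant, so feeding the log probabilities back into softmax recovers p exactly. A minimal sketch, again assuming the stabilized softmax from utils.py:

```python
import numpy as np

def softmax(x):
    # same stabilized softmax as in utils.py (assumption)
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0)

rng = np.random.default_rng(0)
z = rng.standard_normal(5)
p = softmax(z)

# z - log(p) is the constant log(sum(exp(z))) for every component:
print(np.allclose(z - np.log(p), np.log(np.sum(np.exp(z)))))  # True

# softmax only cares about z up to an additive constant:
print(np.allclose(softmax(np.log(p)), p))  # True
```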
Is my reasoning correct?