I noticed some mistranscriptions in the transcripts (subtitles, captions) for the course videos. Below are the errata for the 3rd week. I’ll post the errata for the other weeks in those groups.
The format for the errata is as follows: the approximate time appears in brackets, followed by the mistranscribed word or phrase, then a hyphen, then the correct word or phrase. Like this:
[time] mistranscribed word or phrase - correct word or phrase
video 1 - what is reinforcement learning?
[2:08] reinforced - reinforcement
video 2 - Mars rover example
[1:59] ant - at
[4:58] better - a bit
video 3 - The Return in reinforcement learning
[3:53] parsing - passing
[5:57] low - at all
video 6 - state-action value function definition
[0:08] arrow - algorithm
[1:07] auto - optimal
[1:13] auto - optimal
[1:25] outcomes - algorithms
[3:27] to to - two to
[3:37] state to - state two
[4:23] ST - state
[4:27] stage - state
[4:56] Q - cubed
[5:08] year - here
[5:50] noted - denoted
[6:10] ST S - State S
[6:22] values two - values Q
[6:35] state to - state two
[6:55] in therefore - in state four
[7:11] Q, F, S - Q of s, a
[7:20] stay for - start four
[7:22] two of - Q of
[8:39] pi F - pi of s
[8:57] sudden status - start in the state s
[9:18] action aid - action a
[9:29] cause - course
[10:15] auto - optimal
video 7 - state-action value function example
[0:22] [INAUDIBLE] - Mars Rover
[0:53] gamble - gamma
video 8 - Bellman equation
[1:48] steady - state
[5:06] our 40 -
[6:41] E prime - a prime
video 9 - Random (stochastic) environment
[1:17] 9 percent chance of ending up in say three - 90 percent chance of ending up in state three
[1:51] stages - states
[2:11] maybe your loop and lucky - maybe you’re a little bit lucky
[2:41] see - say
[2:45] hear - here
[2:52] you’re taught to call - you tell it to go
video 10 - example of continuous state space applications
[1:30] Russia toy truck - or actually a toy truck
video 11 - lunar lander
[0:47] trashed - crashed
[1:48] farther left thruster - fire left thruster
[1:58] Maine - main
[2:00] states face - state space
[2:23] fated not - theta dot
[4:29] fear - fewer
[5:26] camera - gamma
video 12 - learning the state-value function
[5:33] new network - neural network
[5:48] further, left fasser, further right fasser further - fire the left thruster, fire the right thruster, fire the
[6:40] thrusts there - thrusters
[6:45] want up - wound up
[7:02] want up - wound up
[9:16] low dataset - little dataset
[9:41] [inaudible] - algorithm
[10:59] new network - neural network
[11:36] widths - weights
[12:19] actually gathering what R - action a get a reward R
video 13 - Algorithm refinement: Improved neural network architecture
[2:21] maximizes a gamma - multiplied by gamma
[2:24] dual network - neural network
[2:42] RN - algorithm
[4:11] explore - exploit
[8:19] learn the lunar lander - land the lunar lander
video 14 - Algorithm refinement: Mini-batch and soft updates
[1:13] gradient in this algorithm - gradient descent algorithm
[1:19] [inaudible] - the learning rate
[8:57] S [inaudible] - as described
video 15 - The state of reinforcement learning
[0:33] his - its
[1:57] supervising and supervised - supervised and unsupervised
video - Andrew Ng and Chelsea Finn on AI and Robotics
[0:27] electric engine - electrical engineering
[1:50] mason - Marc Andreessen
[7:11] our rooms - algorithms
[7:34] **** a water bowl - screw a water bottle
[8:29] I want to be - I wonder if it is
[8:30] hot - hard
[8:33] lot of bottles - water bottle
[13:38] metal learning - meta learning
[14:06] metal learning - meta learning
[14:22] Standard - Stanford
[17:37] tha - that
[19:49] to your heart - [?]
[24:22] trading - training
[24:32] hope - help
[31:28] enforcement - reinforcement