C5W3A2 - Trigger Word Detection

How did we get these two values, (101, 5511)?
Time steps in audio recording before spectrogram (441000,)
Time steps in input after spectrogram (101, 5511)

To work out the answer yourself, there are a few things you need to check:

  1. You need to understand the Discrete Fourier Transform to know what is being done.

  2. The shape (101, 5511) belongs to the variable x, which is produced by the function graph_spectrogram. Check the code of that function in the td_utils.py file to see which underlying function actually produces x.

  3. There are two parameters in graph_spectrogram that control the shape (101, 5511): nfft and noverlap.
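As a rough illustration of how nfft and noverlap could determine that shape, here is a minimal sketch. It assumes the usual conventions for a one-sided spectrogram with no padding, which may differ from what the underlying function does. The function name spectrogram_shape is hypothetical, and nfft=200 and noverlap=120 are assumed values that you should confirm against td_utils.py.

```python
def spectrogram_shape(n_samples, nfft, noverlap):
    """Return (frequency bins, time windows) for a one-sided spectrogram.

    Assumes full windows only, with no padding at the end of the signal.
    """
    # A real-valued window of nfft samples yields nfft // 2 + 1 distinct frequencies.
    n_freqs = nfft // 2 + 1
    # Consecutive windows start (nfft - noverlap) samples apart.
    step = nfft - noverlap
    # Number of full windows that fit in the recording.
    n_windows = (n_samples - noverlap) // step
    return n_freqs, n_windows


# nfft=200 and noverlap=120 are assumptions; check td_utils.py for the real values.
print(spectrogram_shape(441000, nfft=200, noverlap=120))  # (101, 5511)
```

Under these assumptions the first number depends only on nfft, while the second depends on the recording length and on both parameters.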

I recommend that you:

  • experiment with different values of nfft and noverlap to see how they change the shape of x
  • search for the documentation of that underlying function, and for discussions of it on Stack Overflow
  • search for materials that suit your background to learn about the Discrete Fourier Transform
  • check out the other outputs of that underlying function besides x, which may help
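One way to run such an experiment without the notebook is sketched below. It builds the frequency and time axes that typically accompany a spectrogram and prints their lengths for a few parameter pairs. The function name spectrogram_axes, the parameter pairs, and the sampling rate fs=44100 are all assumptions for illustration, and the underlying function in td_utils.py may use different conventions.

```python
def spectrogram_axes(n_samples, nfft, noverlap, fs):
    """Return (freqs, times) for a one-sided spectrogram with full windows only."""
    step = nfft - noverlap
    # Frequency of each bin, from 0 Hz up to fs / 2.
    freqs = [k * fs / nfft for k in range(nfft // 2 + 1)]
    # Centre time (in seconds) of each full window.
    times = [(start + nfft / 2) / fs
             for start in range(0, n_samples - nfft + 1, step)]
    return freqs, times


# Hypothetical parameter pairs; fs=44100 is an assumed sampling rate.
for nfft, noverlap in [(200, 120), (256, 128), (512, 0)]:
    freqs, times = spectrogram_axes(441000, nfft, noverlap, fs=44100)
    print(nfft, noverlap, (len(freqs), len(times)))
```

Comparing the printed lengths with the shape of x for the same nfft and noverlap is a quick way to see whether these assumed conventions match the real function.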

Good luck with your investigation!

Raymond


Thank you