The following line:
maxLen = len(max(X_train, key=len).split())
should be:
maxLen = len(max(X_train, key=lambda x: len(x.split())).split())
The intent is to find the maximum number of words in any X_train element, but the original code instead counts the words in the element that has the most characters.
The bug is hidden because, for the X_train actually used in the example, the two measures happen to be the same.
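A small sketch of how the two expressions can diverge. The X_train below is made-up data chosen to expose the difference, not the one from the example, and the names by_chars, by_words and simplest are only for illustration:

```python
# Hypothetical data: the first string has more words,
# the second has more characters.
X_train = ["a b c d e", "extraordinarily complicated"]

# Original line: picks the string with the most characters,
# then counts the words in that string.
by_chars = len(max(X_train, key=len).split())

# Proposed fix: picks the string with the most words,
# then counts the words in that string.
by_words = len(max(X_train, key=lambda x: len(x.split())).split())

# An equivalent and arguably simpler form: take the max of the word counts.
simplest = max(len(x.split()) for x in X_train)

print(by_chars, by_words, simplest)  # prints: 2 5 5
```

With this data the original line gives 2, while the fixed version gives 5, so any sentence longer than 2 words would exceed maxLen.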