Errors in transcription - why don't you use AI to automate fixes?

The transcription is full of errors, and I assume it is generated automatically by AI. This is unfortunate, as I need the transcription to be able to watch the videos at higher speeds (why is no speed higher than 2 available…?).

It would actually be trivially easy to fix 99% of these errors using ChatGPT-4, in an automated way. Here is an example:

Prompt:
The following is the output of a speech-to-text program. Please flag any words you think might be mistranscribed, and guess at what the correct word is:

"
Now, even if you have only 20 positive training examples,
it might be okay to apply a supervised learning algorithm.
But it turns out that the way anomaly detection looks at the data set
versus the way supervised learning looks at the data set are quite different.
She is the main difference, which is that if you think there are many
different types of an obvious or many different types of positive examples.
Then anomaly detection might be more appropriate when there
are many different ways for an aircraft engine to go wrong.
"

ChatGPT-4:
Here are the words/phrases that might be mistranscribed and the corrections:

"She is the main difference," - Likely should be "Here is the main difference,"
"an obvious" - This phrase seems out of place and likely is a mistranscription. It could be "anomalies" given the context of the conversation.

So, the corrected transcription could be:

"
Now, even if you have only 20 positive training examples,
it might be okay to apply a supervised learning algorithm.
But it turns out that the way anomaly detection looks at the data set
versus the way supervised learning looks at the data set are quite different.
Here is the main difference, which is that if you think there are many
different types of anomalies or many different types of positive examples.
Then anomaly detection might be more appropriate when there
are many different ways for an aircraft engine to go wrong.
"
[End]

It is quite easy to prompt-engineer ChatGPT to output the result in a machine-readable format like JSON for automation. For example, this could be a list of changes, with a diff file for applying each change, for human review. But it seems clear that even without human review, the result would be better than the current transcription shown on the video.
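As a rough sketch of what that could look like (the JSON schema and the helper function here are invented for illustration, not an existing Coursera or OpenAI feature):

```python
import json

# Hypothetical machine-readable output that ChatGPT could be prompted to emit:
# a list of corrections, each pairing the original phrase with the proposed fix.
chatgpt_output = json.dumps([
    {"original": "She is the main difference", "corrected": "Here is the main difference"},
    {"original": "an obvious", "corrected": "anomalies"},
])

def apply_corrections(transcript: str, corrections_json: str) -> str:
    """Apply each proposed correction to the transcript text."""
    for change in json.loads(corrections_json):
        transcript = transcript.replace(change["original"], change["corrected"])
    return transcript

transcript = ("She is the main difference, which is that if you think there are many "
              "different types of an obvious or many different types of positive examples.")
print(apply_corrections(transcript, chatgpt_output))
```

Each entry in the list could just as well be rendered as a unified diff for human review before being applied.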

I assume that the speech-to-text system generates word probabilities, and alternative words with high probability are obviously not available to ChatGPT when it is used like this. So even better results could be obtained through tighter integration, giving ChatGPT the word likelihoods from the audio.
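A sketch of how per-word alternatives from the speech-to-text system might be surfaced to the model (the data format here is made up for illustration; real ASR engines expose this differently, e.g. as word-level confidence scores or n-best lists):

```python
# Hypothetical ASR output: each word with its candidate readings and probabilities,
# sorted by probability. Low-confidence words carry plausible alternatives.
asr_words = [
    ("She", [("She", 0.55), ("Here", 0.40)]),
    ("is", [("is", 0.99)]),
]

def annotate_uncertain(words, threshold=0.9):
    """Render the transcript with uncertain words marked inline, listing the
    alternatives so the language model can pick the contextually likely one."""
    parts = []
    for word, alternatives in words:
        top_prob = alternatives[0][1]
        if top_prob < threshold and len(alternatives) > 1:
            options = "|".join(f"{w} ({p:.2f})" for w, p in alternatives)
            parts.append(f"[{options}]")
        else:
            parts.append(word)
    return " ".join(parts)

print(annotate_uncertain(asr_words))
# prints "[She (0.55)|Here (0.40)] is"
```

A prompt built from this annotated text would let the model resolve "[She (0.55)|Here (0.40)]" to "Here" from context, instead of having to guess that a transcription error occurred at all.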

But the results shown above still seem excellent, and far better than nothing. Clearly the ability of a large language model to understand context and subject matter makes all the difference.

That's right, a large LLM generally has much better capabilities than a smaller one. Why don't they use it? Coursera has been around for much longer than ChatGPT, and they probably have contracts and practices that take time to change.

As I understand it, ChatGPT is allowed for commercial use, and I don't see any legal reason not to use it for this. And the cost to implement it, plus the API usage cost, would be trivial.

You could always implement ChatGPT corrections separately on top of the current system - that seems unlikely to interfere with existing contracts?

So my best guess at the reason this hasn't already been done is inertia/indifference?