Adding punctuation to YouTube transcripts

Hey there!

I’m interested in adding punctuation to raw YouTube transcripts, so I built a proof-of-concept web app and iOS app here:

AppBlit DOT com/scribe

For now it only works on English transcripts, but the results already seem quite good.

The neural net is a DistilBERT token classifier, converted to ONNX so that inference runs in the browser using Transformers.js from Hugging Face :hugs:
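
To give an idea of what happens after the token classifier runs, here is a rough TypeScript sketch of the post-processing step: take one predicted label per word and rebuild the punctuated text. This is not the app's actual code. The label names (`O`, `COMMA`, `PERIOD`, `QUESTION`) and the `applyPunctuation` helper are made up for illustration, since the real label set may well differ, and the model call itself is left out.

```typescript
// Hypothetical label set, assumed for this sketch only.
type PunctLabel = "O" | "COMMA" | "PERIOD" | "QUESTION";

// Punctuation mark appended after a word for each label.
const MARKS: Record<PunctLabel, string> = {
  O: "",
  COMMA: ",",
  PERIOD: ".",
  QUESTION: "?",
};

// Rebuild punctuated text from words and one predicted label per word.
// A word is capitalized when it starts the text or follows a sentence end.
function applyPunctuation(words: string[], labels: PunctLabel[]): string {
  if (words.length !== labels.length) {
    throw new Error("words and labels must have the same length");
  }
  let capitalizeNext = true; // the first word starts a sentence
  const out: string[] = [];
  words.forEach((word, i) => {
    const label: PunctLabel = labels[i] ?? "O";
    const text = capitalizeNext
      ? word.charAt(0).toUpperCase() + word.slice(1)
      : word;
    out.push(text + MARKS[label]);
    capitalizeNext = label === "PERIOD" || label === "QUESTION";
  });
  return out.join(" ");
}

// Example with hand-written labels standing in for model predictions.
const exampleWords = ["hey", "there", "how", "are", "you", "i", "am", "fine"];
const exampleLabels: PunctLabel[] = [
  "O", "COMMA", "O", "O", "QUESTION", "O", "O", "PERIOD",
];
console.log(applyPunctuation(exampleWords, exampleLabels));
// prints "Hey there, how are you? I am fine."
```

In a real pipeline the model predicts labels per subword token, so the predictions would first have to be merged back to whole words before a step like this.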

I would love to make it work in more languages.

I’d also like to start summarizing the transcripts and possibly detecting speaker turns, which would add a lot of value when skimming the text. What do you think?