DLS C5 W1 A3 - audio prediction quality

Hello

Thanks a lot for these assignments; I really enjoyed the Jazz Solo one.
In section 1.1, we load an audio clip which is of good quality. Could you be more specific about the audio preprocessing code?

Then the exercise is carried out on a MIDI file, with the aim of generating a new MIDI sequence that is then exported to an audio clip so we can listen to the result.
After that cell, an example of a generated audio clip is given as a reference.
What code/dataset should we use to reach that level of audio quality, which is drastically higher than that of the clip we generated? The same algorithm? Better preprocessing? If the answer isn't covered here, could you point me to where I should look? I'd like to build models capable of generating clips with the same audio quality as the reference provided at the end of the assignment.
Thanks a lot.

Hi @pat,

The utility functions used in the pre-processing can be found by opening the file directory:

File → Open →
data_utils.py
preprocess.py
music_utils.py
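
If it helps, here's a minimal sketch for printing those files straight from a notebook cell, assuming the .py files sit in the same directory as the notebook:

```python
from pathlib import Path

# Print each pre-processing helper without importing it
# (avoids pulling in any dependencies the modules themselves need).
for name in ["data_utils.py", "preprocess.py", "music_utils.py"]:
    print(f"===== {name} =====")
    print(Path(name).read_text())
```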

If you search the internet, you may find the information you need to build your model. Here is one article I found which might be of interest.

Hi Klc,
Thanks a lot for your kind answer. I'll look at the music utils file.
Thanks also for the article, even though it doesn't completely answer my questions:

With the model we worked on in the assignment, can we achieve the quality level of the generated sample given as a reference, or do we need more? If we need more, where should we go?
Thanks a lot again!

The issue with the quality of the music output in the notebook is the MIDI voices that are loaded with the notebook. They're really low quality.

If you download the MIDI file and play it on a better MIDI system with a better voice library, it will sound a lot better.
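
For example, here's a minimal sketch of re-rendering the notebook's MIDI output through FluidSynth with a higher-quality SoundFont. It assumes the midi2audio package and FluidSynth are installed; FluidR3_GM.sf2 and my_music.midi are illustrative file names, not necessarily the assignment's exact ones:

```python
from midi2audio import FluidSynth

# Render the downloaded MIDI file to WAV using a General MIDI SoundFont.
# Swap in any .sf2 voice library you like; better SoundFonts give much
# richer instrument sounds than the notebook's built-in player.
fs = FluidSynth(sound_font="FluidR3_GM.sf2")
fs.midi_to_audio("my_music.midi", "my_music.wav")
```

The model's output (the note sequence) is unchanged; only the synthesis step differs, which is where most of the perceived quality comes from.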

Note also that the model is simplified to work with only one or two tracks of the original music, so you're not going to get a full orchestration to match the original samples.

Hi TMosh, thanks, that was exactly my point: where would you direct me if I wanted to work with a more sophisticated model? I'm asking because I really liked it, but obviously, as this is a course, I'm a bit ignorant…

Are you saying that the example generated music was made with the exact same model we built? Because it sounds completely different, not just in quality but in speed and complexity. My “song” is relatively simple and maybe 90 bpm. The example is also super jazzy, while the generated music is not.

I did not write the assignment, but I believe the example was not generated using the model you create in this assignment.

The issue may be that a high-fidelity model would require too much training and take too long for the notebook to run (as this would cost a lot more in expensive GPU time).